Simple Python & Go scripts to access static pages from an old website…
Old WordPress website exported as static files, served by a local script:

I used WordPress as a CMS for many years, then replaced it with a Hugo-generated static site.
But I still have an exported static version of the WordPress site, with older posts I’d like to review and bring into the new site if they are still relevant…
I looked into pushing the static site up to S3 (it was already there temporarily). But accessing it directly failed: the pages contain hard-coded absolute URLs that don’t match the S3 address, so they fail to work.
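To get a feel for how widespread the hard-coded URLs are, a small scan of the export can help. This is only a sketch: the function name `count_hardcoded_urls` and the `OLD_HOSTS` tuple are made up for illustration, and it assumes the export is UTF-8 HTML somewhere under a directory you pass in.

```python
import os

# Hypothetical constant for this sketch: the old site prefixes to look for.
OLD_HOSTS = ("https://dougmunsinger.com", "http://dougmunsinger.com")


def count_hardcoded_urls(root, hosts=OLD_HOSTS):
    """Return {relative_path: count} for each .html file under root
    that mentions one of the old site prefixes."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".html"):
                continue  # only HTML is of interest here
            full = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going if a file is not clean UTF-8
            with open(full, "r", encoding="utf-8", errors="replace") as f:
                text = f.read()
            # The two prefixes never overlap, so the counts can simply be summed.
            count = sum(text.count(host) for host in hosts)
            if count:
                hits[os.path.relpath(full, root)] = count
    return hits
```

Calling `count_hardcoded_urls(".")` from the top of the export returns a dictionary of the pages that would break when served from any other address.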
What I needed was a simple local web server that could serve the local static site and rewrite the URLs in the HTML (but not in the CSS and JS) from “https://dougmunsinger.com” to “http://localhost:8080” on the fly.
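The rewrite itself is just a string replacement applied to each HTML page before it is sent. Here is that step in isolation, as a minimal sketch; `rewrite_urls`, `OLD_PREFIXES`, and `LOCAL_PREFIX` are names chosen for this example, and the sample markup is invented.

```python
# Hypothetical names for this sketch; the values come from the description above.
OLD_PREFIXES = ("https://dougmunsinger.com", "http://dougmunsinger.com")
LOCAL_PREFIX = "http://localhost:8080"


def rewrite_urls(html, old_prefixes=OLD_PREFIXES, new_prefix=LOCAL_PREFIX):
    """Replace every old site prefix in an HTML string with the local one."""
    for old in old_prefixes:
        html = html.replace(old, new_prefix)
    return html


# Invented sample markup with one https link and one http image.
sample = (
    '<a href="https://dougmunsinger.com/page/2/">Older posts</a>'
    '<img src="http://dougmunsinger.com/wp-content/uploads/a.jpg">'
)
print(rewrite_urls(sample))

# A URL written with escaped slashes (as inline JSON may do) is not matched
# by a plain replace and passes through unchanged.
print(rewrite_urls(r'{"home":"https:\/\/dougmunsinger.com"}'))
```

Both the `https` and `http` forms end up pointing at localhost. Anything that spells the address differently, such as a `www.` variant or the escaped form in the second call, would need its own entry in the prefix list.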
Either Python or Go would work. Here’s the Python version.
```python
import http.server
import socketserver
import os
import io

PORT = 8080
REWRITE_FROM = ["https://dougmunsinger.com", "http://dougmunsinger.com"]
REWRITE_TO = "http://localhost:8080"


class RewriteHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
    def send_head(self):
        # Map the request path to a file, using index.html for directories
        path = self.translate_path(self.path)
        if os.path.isdir(path):
            index_path = os.path.join(path, "index.html")
            if os.path.exists(index_path):
                path = index_path
            else:
                self.send_error(404, "Directory index not found")
                return None

        # Only HTML is rewritten; everything else falls through to the default handler
        ext = os.path.splitext(path)[1]
        if ext == ".html":
            try:
                with open(path, 'r', encoding='utf-8') as f:
                    content = f.read()
                for old in REWRITE_FROM:
                    content = content.replace(old, REWRITE_TO)
                encoded = content.encode('utf-8')
                self.send_response(200)
                self.send_header("Content-type", "text/html; charset=utf-8")
                self.send_header("Content-Length", str(len(encoded)))
                self.end_headers()
                return io.BytesIO(encoded)
            except FileNotFoundError:
                self.send_error(404, "File not found")
                return None
        return super().send_head()

    def copyfile(self, source, outputfile):
        # Browsers often drop connections mid-transfer; log it instead of printing a traceback
        try:
            super().copyfile(source, outputfile)
        except BrokenPipeError:
            print("[WARN] Client closed connection early (Broken pipe)")

    def log_message(self, format, *args):
        print(f"[REQUEST] {self.path} → {args[0]}")


if __name__ == "__main__":
    # Serve from the directory that contains this script
    os.chdir(os.path.dirname(os.path.abspath(__file__)))
    with socketserver.TCPServer(("", PORT), RewriteHTTPRequestHandler) as httpd:
        print(f"🚀 Serving at http://localhost:{PORT}")
        httpd.serve_forever()
```
Running…
```text
spence:dougmunsinger.com dsm$ python server_rewrite.py
🚀 Serving at http://localhost:8080
[REQUEST] /2024/11/05/terraform-and-personal-websites/ → GET /2024/11/05/terraform-and-personal-websites/ HTTP/1.1
[REQUEST] /wp-content/uploads/2024/10/cropped-20110921_Canon_P_100mm_Test_058-e1729632949838-32x32.jpg → GET /wp-content/uploads/2024/10/cropped-20110921_Canon_P_100mm_Test_058-e1729632949838-32x32.jpg HTTP/1.1
[WARN] Client closed connection early (Broken pipe)
[REQUEST] / → GET // HTTP/1.1
[REQUEST] /wp-content/uploads/2024/10/cropped-20110921_Canon_P_100mm_Test_058-e1729632949838-192x192.jpg → GET /wp-content/uploads/2024/10/cropped-20110921_Canon_P_100mm_Test_058-e1729632949838-192x192.jpg HTTP/1.1
[WARN] Client closed connection early (Broken pipe)
[REQUEST] /page/2/ → GET /page/2/ HTTP/1.1
```
And the Go version.
```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"path/filepath"
	"strings"
)

const (
	port        = ":8080"
	rootDir     = "." // serve from current directory
	oldURL1     = "https://dougmunsinger.com"
	oldURL2     = "http://dougmunsinger.com"
	replacement = "http://localhost:8080"
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		logRequest(r)

		// Get requested file path
		path := filepath.Join(rootDir, filepath.Clean(r.URL.Path))
		info, err := os.Stat(path)
		if err != nil || info.IsDir() {
			// Try to serve index.html for directories
			indexPath := filepath.Join(path, "index.html")
			info, err = os.Stat(indexPath)
			if err == nil && !info.IsDir() {
				serveFileWithRewrite(w, indexPath)
				return
			}
			http.NotFound(w, r)
			return
		}

		// HTML files get rewritten
		if strings.HasSuffix(path, ".html") {
			serveFileWithRewrite(w, path)
			return
		}

		// Serve other files directly
		http.ServeFile(w, r, path)
	})

	fmt.Println("Serving on http://localhost" + port)
	err := http.ListenAndServe(port, nil)
	if err != nil {
		fmt.Println("Server error:", err)
	}
}

func serveFileWithRewrite(w http.ResponseWriter, path string) {
	content, err := os.ReadFile(path)
	if err != nil {
		http.Error(w, "File read error", http.StatusInternalServerError)
		return
	}
	rewritten := strings.ReplaceAll(string(content), oldURL1, replacement)
	rewritten = strings.ReplaceAll(rewritten, oldURL2, replacement)
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	w.WriteHeader(http.StatusOK)
	io.WriteString(w, rewritten)
}

func logRequest(r *http.Request) {
	fmt.Printf("%s - [%s] %s %s\n", r.RemoteAddr, r.Method, r.URL.Path, r.UserAgent())
}
```
Running…
```text
spence:dougmunsinger.com dsm$ go run go_server.go
Serving on http://localhost:8080
127.0.0.1:59363 - [GET] / Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0
127.0.0.1:59363 - [GET] /page/2/ Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0
127.0.0.1:59363 - [GET] / Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0
127.0.0.1:59363 - [GET] /page/2/ Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0
127.0.0.1:59363 - [GET] /2024/11/05/terraform-and-personal-websites/ Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:140.0) Gecko/20100101 Firefox/140.0
```
This makes it possible to just read the static pages in a browser. The WordPress export is cluttered with menus, JavaScript, and random CSS, so reading the exported content directly as text is very difficult. It is much easier when presented as a webpage.
Example exported static file content (pdf).
Both web servers worked beautifully. The Go version is significantly faster, even when started with go run (which builds a temporary binary on each launch). Built ahead of time with go build, the Go binary is lightning fast.
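I did not record numbers for the speed difference. If you want to measure it yourself, something like the following would do. It is a rough sketch only: `time_requests` and `summarize` are hypothetical helper names, and it assumes one of the two servers is already listening when you call it.

```python
import statistics
import time
import urllib.request


def time_requests(url, n=20):
    """Fetch url n times and return the per-request durations in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as resp:
            resp.read()  # include the body transfer in the measurement
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (median, max) of the durations, converted to milliseconds."""
    return (statistics.median(durations) * 1000, max(durations) * 1000)
```

Run `summarize(time_requests("http://localhost:8080/"))` once against the Python server and once against the Go server, using the same page, and compare the medians. Timings taken this way include the client's own overhead, so treat them as a relative comparison rather than an absolute figure.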
Now I can delete the old S3 content and use it as a staging area…
—doug