
The Lost Archive — 58 posts that didn't survive the WordPress reinstall

When I rebuilt this blog in 2026 from a cPanel SQL dump, 27 posts came back clean. 58 did not. Here is the memorial — the slugs, the years, the topics — for posts I wrote between 2010 and 2013 that exist now only in Google's sitemap and the Wayback Machine.

When I migrated this blog from WordPress to its current setup, I pulled posts out of the cPanel SQL dump (wp_posts table). 27 posts came through clean — 2008 and 2009 mostly, the college years.

Then I checked the old XML sitemap.

It listed 58 more URLs that the database no longer had rows for. Posts I wrote between 2010 and 2013 — mostly during my Symantec years, the pre-Perk.com Bangalore-relocation phase, the period right before this blog effectively went dormant for a decade.

Somewhere along the way — a WordPress reinstall, a host migration, a database I dropped without thinking, a backup I didn’t take — these posts got severed from their content. The URLs persisted in the sitemap. The bodies did not.

This post is a record of what didn’t make it. It’s also a marker so that if I ever recover them from the Wayback Machine, I can wire them back into this site without having to re-derive the list.

the count

2010 ▸ 14 posts
2011 ▸ 22 posts
2012 ▸ 21 posts
2013 ▸  1 post
─────────────
total: 58
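
The tally above boils down to a set difference (sitemap URLs minus URLs that still have a database row) followed by a group-by on the year in the path. A minimal sketch of that step, assuming a standard sitemaps.org `sitemap.xml` and an already-extracted set of surviving paths; the function names and the `/YYYY/MM/slug/` pattern are illustrative assumptions, not the script actually used for this post:

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Assumed WordPress permalink shape: /YYYY/MM/slug/
YEAR_RE = re.compile(r"/(\d{4})/\d{2}/[^/]+/?$")


def sitemap_paths(sitemap_xml):
    """Return the set of URL paths listed in a sitemap.xml document."""
    root = ET.fromstring(sitemap_xml)
    return {
        urlparse(loc.text.strip()).path
        for loc in root.iterfind("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def lost_by_year(sitemap_xml, surviving_paths):
    """Paths in the sitemap with no surviving post, plus a per-year count."""
    lost = sitemap_paths(sitemap_xml) - set(surviving_paths)
    years = Counter()
    for path in lost:
        match = YEAR_RE.search(path)
        if match:
            years[match.group(1)] += 1
    return sorted(lost), dict(sorted(years.items()))


if __name__ == "__main__":
    # Tiny stand-in sitemap; example.com is a placeholder host.
    demo = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url><loc>https://example.com/2010/02/google/</loc></url>
      <url><loc>https://example.com/2011/03/jai-telangana/</loc></url>
    </urlset>"""
    lost, per_year = lost_by_year(demo, surviving_paths=set())
    for year, count in per_year.items():
        print(year, count)
```

Comparing on the path rather than the full URL means a later change of host or scheme does not show up as a false loss.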

what they were about

Reading the slugs back (and I’m reading slugs because that’s all I have), four buckets emerge:

Trips. I traveled a lot in this period. Ooty, Mantralayam, Orissa, Srisailam, Nagarjuna Sagar, Tada Falls, Kailasakona Waterfalls, a New Year trip to Mumbai. Cheap weekends out of Hyderabad. Each one was a blog post.

Movie reviews. My Name is Khan, Vinnai Thandi Varuvaya, Sura, Ravanan, Endhiran/Robo, Mangatha. Mostly South Indian cinema, dutifully reviewed within a week of release. I was 22-25. Reviewing movies was a thing you did.

Tech and college pieces. Oracle functions, Norton Ninja security suite for Android, MongoDB on XAMPP, OpenShift from RedHat, batch scripts for periodic MySQL dumps, a paper presentation on “Security as a Service”, FOSS / LAMP architecture, a coaching session for school students. The Symantec era leaked through. So did the second degree, the MBA (the post titled “It’s now Karthikeyan, B.E., M.B.A.”; I remember writing that one).

Personal scribbles. “Running out of time”, “The world after you die”, “Time is running, flying and swimming”, “Reality is a bullshit, really”, “Missing time”, “India and Indian educational system”, “Self social media policy”. The 22-year-old’s existential phase, mostly. I would not write any of those today, but I would not delete them either.

the full list

Year by year, slug by slug. Each one a URL that resolves to a 404 today.

2010 (14)

  • /2010/02/google/
  • /2010/02/i-love-this-lyrics/
  • /2010/02/movie-review-my-name-is-khan/
  • /2010/03/movie-review-vinnai-thandi-varuvaya/
  • /2010/03/oracle-functions/
  • /2010/05/computers-without-harddisk/
  • /2010/05/movie-review-sura/
  • /2010/06/movie-review-ravanan/
  • /2010/07/lonely-trip-to-mantralayam/
  • /2010/09/orissa-trip/
  • /2010/10/movie-review-endhiranrobo/
  • /2010/10/older-designs/
  • /2010/10/srisailam-trip/
  • /2010/12/nagarjuna-sagar-trip/

2011 (22)

  • /2011/01/mumbai-trip-on-new-year-2011/
  • /2011/03/jai-telangana/
  • /2011/04/running-out-of-time/
  • /2011/06/coaching-session-for-school-students/
  • /2011/07/norton-ninja-security-suite-for-your-android/
  • /2011/07/tada-falls-trip-ubbalamadugu-falls/
  • /2011/08/blog-with-simple-and-clean-look/
  • /2011/08/kailasakona-waterfalls-trip/
  • /2011/08/piracy-easy-loop-hole-for-virus/
  • /2011/09/anna-hazare-co-an-hawk-eye-view/
  • /2011/09/information-retrieval-from-social-web/
  • /2011/09/movie-review-mangatha/
  • /2011/09/the-world-after-you-die/
  • /2011/09/time-is-running-flying-and-swimming/
  • /2011/10/266/
  • /2011/10/batch-script-for-taking-periodic-mysql-dumps/
  • /2011/10/i-was-always-on-facebook/
  • /2011/10/issue-with-wordpress/
  • /2011/10/its-time-for-art-work/
  • /2011/10/one-word-one-day/
  • /2011/10/self-social-media-policy/
  • /2011/12/nothing-devnull/

2012 (21)

  • /2012/02/12-12-12/
  • /2012/02/hiding-your-name-in-lovers-photo/
  • /2012/02/its-now-karthikeyan-b-e-m-b-a/
  • /2012/02/security-as-a-service-paper-presentation/
  • /2012/03/beware-of-using-the-f-word/
  • /2012/03/coaching-class-to-school-students/
  • /2012/03/crazy-reply-for-a-bug-report/
  • /2012/03/making-code-open-source/
  • /2012/04/openshift-from-redhat/
  • /2012/04/trip-to-ooty/
  • /2012/05/an-alternative-attitude/
  • /2012/05/firefox-banned-on-windows-8/
  • /2012/06/presentation-on-foss-and-lamp-architecture/
  • /2012/06/reality-is-a-bullshit-really/
  • /2012/06/running-out-of-money-time-and-sleep-for-the-month/
  • /2012/07/india-and-indian-educational-system/
  • /2012/07/mongo-db-with-php-using-xampp-in-windows/
  • /2012/07/the-books-i-read/
  • /2012/08/missing-time/
  • /2012/09/from-microsoft-windows-8-appfest-contest/
  • /2012/09/oorukku-pudhusa-new-to-the-city/

2013 (1)

  • /2013/01/india-and-its-culture/

That last one — January 2013 — is the final post the sitemap has on record before the blog effectively went dark for a decade. I moved to Bangalore around that time, joined Perk.com as a founding engineer, and the blog stopped being where I wrote things.

why I’m publishing the list

Three reasons.

One: a marker. If I ever recover a body from web.archive.org, I want a single page on this site that lists every URL I’m trying to reanimate, with stable anchors. This is that page.

Two: redirects. Astro lets me set up 301 redirects for old slugs. Now that the canonical list lives in version control, I can route any of these URLs to a recovered post, or to a stub explaining the loss, without having to re-derive the inventory each time.

Three: a small honesty. A blog with 14 years of date-stamps but no actual posts between 2009 and 2014 looks suspicious — like the archive was edited. It wasn’t. The archive was lost. There is a difference, and it deserves to be on the record.
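
The redirect idea in reason two could be generated straight from the list. A rough sketch, assuming the table is emitted as JSON and then loaded by the site config; the `{status, destination}` shape mirrors what Astro's `redirects` option is documented to accept, while `build_redirects`, the `/writing/lost/` stub prefix, and the example destination are hypothetical names chosen for illustration:

```python
import json


def build_redirects(lost_paths, recovered, stub_prefix="/writing/lost"):
    """Map each old WordPress path to a recovered post, or to a stub page.

    lost_paths:  iterable of old paths such as "/2010/02/google/"
    recovered:   dict of old path -> new path for posts already restored
    stub_prefix: hypothetical location of the "this post was lost" stubs
    """
    redirects = {}
    for old in lost_paths:
        slug = old.strip("/").split("/")[-1]
        destination = recovered.get(old, f"{stub_prefix}/{slug}/")
        redirects[old] = {"status": 301, "destination": destination}
    return redirects


if __name__ == "__main__":
    table = build_redirects(
        ["/2010/02/google/", "/2012/04/trip-to-ooty/"],
        # Hypothetical: pretend the Ooty post has been recovered.
        recovered={"/2012/04/trip-to-ooty/": "/writing/trip-to-ooty/"},
    )
    # The JSON could be imported by the Astro config and passed as `redirects`.
    print(json.dumps(table, indent=2))
```

Keeping the recovered map separate from the inventory means a newly restored post only needs one new entry; every other URL keeps pointing at its stub.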

the recovery plan

I built a Wayback recovery script earlier this year that takes a list of URLs, hits the Wayback CDX API, finds the best snapshot per URL, downloads the HTML, strips WordPress chrome, and emits a Markdown file with the original publish date in the front-matter. It works. It’s slow because Wayback rate-limits aggressively, but it works.

I haven’t run it on this list yet. I will, eventually. When a snapshot exists I’ll convert it. When a snapshot does not exist — and Wayback’s coverage of intrepidkarthi.com/blog/ in 2010-2011 is patchy — that post is just gone, and I’ll leave a stub at the original URL acknowledging it.
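
For reference, the CDX lookup at the front of that pipeline might look roughly like the sketch below. It is not the actual script: the helper names are made up, "best snapshot" is assumed to mean the most recent capture that returned HTTP 200, and the two-second pause is an arbitrary guess at a polite delay. The `id_` suffix in the snapshot URL asks Wayback for the original HTML without its toolbar.

```python
import json
import time
import urllib.request
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(page_url):
    """Build a CDX query that lists successful captures of one URL as JSON."""
    params = {
        "url": page_url,
        "output": "json",
        "fl": "timestamp,original,statuscode",
        "filter": "statuscode:200",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def best_snapshot(cdx_rows):
    """Pick the latest capture from parsed CDX JSON (first row is the header).

    Returns a raw-content snapshot URL, or None when there are no captures.
    """
    if len(cdx_rows) < 2:
        return None
    header, *rows = cdx_rows
    records = [dict(zip(header, row)) for row in rows]
    # Timestamps are fixed-width YYYYMMDDhhmmss, so string order is time order.
    latest = max(records, key=lambda r: r["timestamp"])
    return f"https://web.archive.org/web/{latest['timestamp']}id_/{latest['original']}"


def find_snapshots(page_urls, delay_seconds=2.0):
    """Query CDX for each URL, pausing between requests to respect rate limits."""
    results = {}
    for page_url in page_urls:
        with urllib.request.urlopen(cdx_query_url(page_url), timeout=30) as resp:
            results[page_url] = best_snapshot(json.load(resp))
        time.sleep(delay_seconds)
    return results
```

A `None` result is the "that post is just gone" case: no capture exists, so the URL gets a stub instead of a recovered body.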

a note on what I learned

Backups are not optional, and “the host has backups” is not a backup. The reason these 58 posts are gone is that I trusted a sequence of WordPress reinstalls and shared-host backup retention policies. None of them held. Today everything I write goes through git into a repo I own, and the repo gets pushed to two remotes I control. That is the only backup strategy that has ever survived contact with my own future negligence.

If you have a blog older than ten years and you have not personally verified that you can rebuild it from a backup you control, this is your reminder. Go check. The lost archive doesn’t announce itself — it just quietly stops resolving one day, and you find out years later when you go looking for a post that used to be there.


addendum — the count is bigger than 58

After publishing this I went back through the original public_html/ dump more carefully and found a fuller sitemap.xml dated March 18, 2014. It contains 122 post URLs in total, not the 85 (27 surviving + 58 lost) that the first pass implied. Doing the arithmetic against what came through in the database (27 surviving posts) and what was on this site at publish time, the real lost-post count is about 95 (122 − 27), not 58. Breakdown of the additional URLs the deeper scan surfaced:

2007/12         ▸  4 posts (about-madurai, getting-bored-at-home,
                            memory-leak, my-book-app)
2008/01         ▸  1 post  (yo-yo-robo-on-d-track)
2009/04-12      ▸ 22 posts (technical: Python time calculator,
                            random-numbers in MySQL, Scheme programs
                            set 1/2/3, "what happens when you enter
                            a URL", how AJAX works, reverse a
                            linked list, session vs cookie, storing
                            fonts in SQL Server, …)
                  + a TCS-training-era set: tcs-days-starts,
                    hyderabad-session, kerala-session, gemini-info-way,
                    valley-beach-trip, then-mala-water-falls-trip
                  + movie reviews: aadhavan, vettaikaran
2010/01-02      ▸  4 Tamil-language posts (URLs UTF-8 encoded;
                    "எனது முதல் தமிழ் படைப்பு / my first Tamil work",
                    "உலக காதலர்களுக்கு / to the world's lovers",
                    "என்னவள் யாரோ / who is my beloved",
                    "நண்பா நண்பா / friend friend")
2013/02-11      ▸  7 NEW (kindle-fire-debugging, copying-contacts-
                            nokia-android, survival-of-the-fittest,
                            rabbitmq-setup-android, frequent-wlan-
                            disconnection-ubuntu-12-04, problems-of-
                            india, ban-bang-bangalore, one-day-at-
                            techcrunch-hackathon)
2014/01         ▸  2 posts (plans-2014, vodafone-is-reading-smsmms)

Plus three pages — /about-intrepidkarthi/, /wishlist/, /domains-for-sale/ — that the WordPress install had as standalone pages and that didn’t make it either.

The 2009-09 batch is particularly bittersweet. Nine short technical posts, all of them written during the TCS training period in Chennai, all of them the kind of “I just learned something, here is the explanation” piece that I would still write today. How AJAX works. Reverse a linked list. What happens when you enter a URL in the browser. The genre that became Stack Overflow content for an entire generation of Indian engineers. I was writing it in real time and the entire batch is gone.

Same lesson, sharper version: the archive doesn’t tell you it’s incomplete until you go looking. The first count was 58 because I stopped digging at the cPanel SQL dump. The real count is ~95 because the sitemap reaches deeper. There are probably more URLs in the Wayback Machine that aren’t in any sitemap I have. I don’t know. I won’t know until I look.


If you remember reading any of these — particularly the 2010 trip posts, the 2010 Tamil poetry, the Mangatha review, or any of the 2009-09 technical pieces — and have a copy in an old browser cache or an RSS reader’s archive, drop me a line. I’d like to put them back.
