The Problem with Archive.org

Update: Well, it looks like the post by Adam was actually really old (Jul 2). Why is Google Reader showing me feeds this far back…? Hmm. Anyway, I still hold my opinion.

Adam Howell hits it right on the nose with his latest post. Archive.org’s Wayback Machine should store full-page images instead of HTML code. So many sites (some dating back more than ten years!) are borked because of missing images and broken code.

A few problems with this idea:

  1. What about fully Flash-based websites, where a single screenshot generally wouldn’t do the site justice? (In that case, though, you could just fall back to HTML.)
  2. How to preserve navigation? Some sort of image map? Sounds tricky, though; maybe offering images alongside a sort of barebones HTML would help.
  3. What about text? See above.

It’s too late to save the ones that have come and gone, but for future storage I can see this as a very possible solution.

5 Comments

  1. Posted November 7, 2007 at 12:59 am | Permalink

    Hehe, it was my fault it bubbled back to the top, I did a little bit of reorganizing/redesigning and weeded out a bunch of old, useless posts.

    Re: preserving navigation, etc. Actually my thought was that, for most sites, just a screenshot of the homepage every month would be good enough — at least enough to preserve the design (which is where that post was biased towards, more design archival than content archival).

    I just really want something that preserves design on the web for all time, and screenshots (of both Flash and non-Flash sites) would work so well, it’s sad there’s not something like that out there.

  2. Posted November 7, 2007 at 2:18 pm | Permalink

    Oh, thanks for clearing that up! Hehe, I thought Google Reader was stuck in a time warp or something.

    Anyway, yeah, design-wise screenshots are great. I mean, it would be pretty cool to put together an evolution of some long-standing website with just a few mouse clicks. And as time goes by (and the web gets older) this would become increasingly valuable and interesting.

  3. Posted November 8, 2007 at 4:08 pm | Permalink

    That is exactly the problem that I had with Archive.org: It doesn’t store images.

    Personally, I think that a screenshot would be enough, yet I think we have to work on how these screenshots are preserved and stored. A while ago I came up with a concept that I call Graphical Bookmarking, and I am currently working on implementing it. It will take a while before anything is functional, but I would love to hear what you think, as you seem to have the same problems/demands as me.

    P.S: I like your site a lot. The only thing that I am missing is a Preview button.

  4. Posted November 9, 2007 at 12:15 pm | Permalink

    Thanks Dominik! I’ve posted my thoughts on your blog.

    P.S. Assuming you use Wordpress, how do you add a preview button? (I’m guessing it’s a plugin, eh? I haven’t looked. What do you use?)

  5. Posted November 9, 2007 at 12:51 pm | Permalink

    Thanks for the feedback, Hamish.
    Personally, I do not use Wordpress, but Expression Engine.
    However, I just googled a bit and it seems that the functionality was removed from Wordpress for performance reasons… (?)
    It seems like you have a live comment preview plugin such as this one.
