Flick Harrison on Thu, 7 Jan 2010 11:38:26 +0100 (CET)
Re: <nettime> fast-changing propaganda website archiving tools?
Thanks for the responses, people.

I've tested out the tools suggested and still encounter the same problems. Maybe this is sliding from a search for tools into a techie discussion... but more tools would be appreciated!

On the Sri Lankan / Tamil campaign site, for instance,

http://www.defence.lk/orbat/Default.asp

wget-based interfaces don't follow links within the Flash files, nor (I think) localize the links within Flash animations. So clicking on "show photos" or "show animation" in the downloaded version, for instance, doesn't work.

I tried wgetting the entire SL Ministry of Defence site, and after 6 hours and 6.4 gigs of downloads, the downloaded version of the interactive battle map still doesn't work.

On this other site,

http://www.thisisdion.ca/Htmlsite/_old_html_index.html

JavaScript-type popup links (i.e. onClick="MyWindow=window.open('meetDionpop.html')") don't get followed by wget, even if it's told to follow links.

I solved that one by wgetting everything in the domain, then clicking the original popups one by one in Firefox and saving them as "web page, complete" in the same directory. For a simple couple of pages that's doable, but it seems error-prone (i.e. localization would get very confused).

Thanks,

Flick

A summary of the suggestions:

On 5-Jan-10, at 01:08 , Michael van Schaik wrote:

> On mac I've had good results using sitesucker.app
> http://www.sitesucker.us/
>
> It has a GUI and can be configured to e.g. download infinitely but only
> from one domain.

On 4-Jan-10, at 14:12 , Chris wrote:

> I use HTTrack, I dunno if it does Flash, you might also need to write
> some shell script wrappers for it:
>
> http://www.httrack.com/
>
> It's GPL'd and in debian, I have never used the GUI interface... ;-)

On 4-Jan-10, at 13:47 , Karin Spaink wrote:

> You might want to try DeepVacuum. It works with wget but it has a
> nice user interface, and it's built for the Mac: http://www.hexcat.com/

* FLICK's WEBSITE & BLOG: http://www.flickharrison.com
* FACEBOOK http://www.facebook.com/profile.php?id=860700553
* MYSPACE: http://myspace.com/flickharrison

# distributed via <nettime>: no commercial use without permission
# <nettime> is a moderated mailing list for net criticism,
# collaborative text filtering and cultural politics of the nets
# more info: http://mail.kein.org/mailman/listinfo/nettime-l
# archive: http://www.nettime.org contact: nettime@kein.org
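
A possible way around the JavaScript popup problem described in the message above, instead of clicking each popup by hand: scan the already-downloaded HTML for window.open() calls and turn their targets into a plain list of URLs that wget can fetch. This is only a rough sketch. The names popup_urls and WINDOW_OPEN are made up for illustration, and the regular expression assumes the popup target is a literal quoted string passed as the first argument, as in the thisisdion.ca example. Popups whose URL is assembled by script at runtime would be missed, and this does nothing for links buried inside Flash files.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: captures the first quoted argument of window.open(...).
# It only handles literal strings, not URLs built up by JavaScript at runtime.
WINDOW_OPEN = re.compile(r"""window\.open\(\s*(['"])(.*?)\1""")


def popup_urls(html, base_url):
    """Return absolute URLs of window.open() targets found in html.

    Order of first appearance is kept and duplicates are dropped.
    base_url is the address of the page the HTML came from, so that
    relative targets resolve the way a browser would resolve them.
    """
    found = []
    for _quote, target in WINDOW_OPEN.findall(html):
        url = urljoin(base_url, target)
        if url not in found:
            found.append(url)
    return found


if __name__ == "__main__":
    # Sample markup modelled on the popup link quoted in the message.
    sample = (
        "<a href=\"#\" "
        "onClick=\"MyWindow=window.open('meetDionpop.html' )\">Meet Dion</a>"
    )
    base = "http://www.thisisdion.ca/Htmlsite/_old_html_index.html"
    for url in popup_urls(sample, base):
        print(url)
```

Saving the printed URLs to a file (say popups.txt, a made-up name) and running wget with --input-file=popups.txt and --page-requisites should pull the popup pages down alongside the rest of the mirror. The onClick handlers in the saved pages would still point at the original locations, so the localization worry remains.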