Many of you are no doubt familiar with that tremendous resource, the Internet Archive, housed at
Archive.org. The Archive exists for the same reason as any other: to preserve information from the past and present for future generations. The web is by its very nature a fickle and transitory medium, with no centralized way to preserve aging or obsolete websites and web content. The Internet Archive
wants to change that:
Libraries exist to preserve society’s cultural artifacts and to provide access to them. If libraries are to continue to foster education and scholarship in this era of digital technology, it’s essential for them to extend those functions into the digital world...
...[W]ithout cultural artifacts, civilization has no memory and no mechanism to learn from its successes and failures. And paradoxically, with the explosion of the Internet, we live in what Danny Hillis has referred to as our "digital dark age."
The Internet Archive is working to prevent the Internet — a new medium with major historical significance — and other "born-digital" materials from disappearing into the past. Collaborating with institutions including the Library of Congress and the Smithsonian, we are working to preserve a record for generations to come.
Open and free access to literature and other writings has long been considered essential to education and to the maintenance of an open society. Public and philanthropic enterprises have supported it through the ages. The Internet Archive is opening its collections to researchers, historians, and scholars... [and]...we are hopeful about the development of tools and methods that will give the general public easy and meaningful access to our collective history.
One of the central elements of the Internet Archive's preservation effort is the regular capture and storage of pages from around the web, accessible through what they call
The Wayback Machine. Over time, multiple versions of a website are stored, and a user can search the archive to compare, say, what I.B.M.'s website looked like in January 1998 with its current incarnation. Here is the
current homepage for
Daily Caveat parent firm
Caveat Research and here is the
Internet Archive copy of our
original homepage.
You can take a look at your own company's web history through The Wayback Machine
via this link.
It is the use (and alleged misuse) of this tool that has landed Pennsylvania law firm
Harding, Earley, Follmer & Frailey in potentially hot water. Via law.com:
In the suit, Healthcare Advocates Inc. v. Harding Earley Follmer & Frailey, et al., the plaintiff claims that during the discovery phase of a prior lawsuit, employees of the law firm violated the Digital Millennium Copyright Act and the Computer Fraud and Abuse Act.
Plaintiff's attorneys Peter J. Boyer and Scott S. Christie of McCarter & English allege in the suit that employees of Harding Earley were aware that the archives of Healthcare Advocates' Web site were blocked, but "successfully hack[ed]" into the system and "finally managed to successfully circumvent the security."
Also named as a defendant in the suit is Internet Archive, a nonprofit organization in San Francisco that, according to the suit, was "founded to build an 'Internet library' with the purpose of offering researchers, historians, and scholars permanent access to historical collections that exist in digital format."
The suit alleges that Internet Archive had agreed to block public access to the archived historical content of Healthcare Advocates, but "failed to perform its duty." In the 12-count complaint, the law firm is also accused of copyright infringement, civil conspiracy, trespass to chattels, trespass for conversion and intrusion upon seclusion. Internet Archive is being sued for breach of contract, promissory estoppel, breach of fiduciary duty, negligent dispossession and negligent misrepresentation.
In an interview Tuesday, plaintiff's attorney Christie -- a former federal prosecutor in the District of New Jersey where he headed the computer hacking and intellectual property section -- said the case is about the right of Web site owners to protect their copyrighted material by insisting that archive sites block access.
As for the law firm he sued, Christie said their actions were "antithetical to the way lawyers are expected to conduct themselves," and that, as specialists in the area of intellectual property, "they should have known better." In the opening paragraphs of the suit, the plaintiff's lawyers contend there were "at least 92 separate acts of unauthorized electronic access" of Healthcare Advocates' Web archives committed by "partners, associates, legal assistants and other employees of [the] Harding Earley law firm."
According to the suit, Healthcare Advocates is "a pioneer in the patient advocacy field," and "assists the public in securing, paying for, and receiving reimbursement for necessary health care." In June 2003, Healthcare Advocates filed a copyright and trademark infringement suit against a competitor, Health Advocate Inc. The suit was dismissed in February 2005 when Senior U.S. District Judge Robert F. Kelly concluded that the plaintiff's claimed trademark was not entitled to protection because it had not attained any "secondary meaning."
That ruling is now on appeal.
But in the new suit, Healthcare Advocates claims that during the discovery phase of the prior suit, Harding Earley -- an intellectual property boutique in Valley Forge, Pa. -- was hired by one of the defendants and set out to hack into its Web archives.
The suit also alleges that Internet Archive promised that the archives would be blocked from public view, but failed to deliver on that promise. According to the suit, Internet Archive "archives the content of publicly accessible Internet Web sites that otherwise would disappear as it is updated, removed by its owners, or otherwise ceases to exist in an effort to preserve the cultural and historical value of this material."
On its site, the suit says, Internet Archive allows visitors to use its "Wayback Machine" to access any of its archived Web sites "as it existed on any or all of the capture dates as far back as 1996." The suit also says Internet Archive has an "exclusion policy" that allows Web site owners to block public access. To take advantage of the policy, the suit says, Web site owners are instructed to install a file named "robots.txt" on their servers.
"Internet Archive represented to Web site owners that as long as the denial text string was properly installed in the robots.txt file of the computer server hosting their Web sites, defendant Internet Archive would prevent individuals from gaining access to the archived historical content for their Web sites via the Wayback Machine," the suit says.
The suit alleges that when employees of Harding Earley at first attempted to access the archives, they were confronted with a screen that said: "We're sorry, access to http://www.healthcareadvocates.com has been blocked by the site owner via robots.txt." But instead of simply giving up, the suit alleges, Harding Earley set out to hack its way past the site's security mechanism...
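For readers who have not run into robots.txt before, here is a rough sketch of the kind of "denial text string" the suit describes and how a well-behaved crawler would read it. The user-agent name `ia_archiver` is my assumption (it is the name commonly associated with the Archive's crawler, not something quoted in the complaint), and `example.com` is a placeholder. This uses Python's standard `urllib.robotparser` and only illustrates how the rules are matched; it says nothing about how the Wayback Machine itself enforced the block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. "ia_archiver" is an assumed crawler
# name for illustration; the complaint does not quote the exact string.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, not the site named in the suit.
PAGE = "http://www.example.com/index.html"

archiver_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", PAGE)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE)

print("ia_archiver allowed:", archiver_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Note that the rule names one crawler only: any agent not listed is unaffected. Just as important, robots.txt is a request that cooperating software chooses to honor, not an access control in the usual sense.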
Read the rest here.
-- MDT