
This is like scraping Amazon's reviews because they are available via Google and then claiming you can display that user-submitted content however you like. I can't see 3Taps having much of a case here.


I don't know of any precedent that gives sites that accept user-generated content grounds to sue when their users' copyrights are violated. I would like to hear of one.


Seconded.

Certainly this issue has come up before. Maybe the fact that there is no precedent tells us something? Who would benefit from keeping the issue undecided?


If you're scraping user-submitted content by accessing (e.g.) Craigslist, then I would think that Craigslist could try to claim that it was unauthorized access of their systems.


Agreed, but my understanding is that 3Taps pulled the listings via Google cache.


It's not at all like that. The interesting thing seems to be the claim that Craigslist posts lack creativity, which Amazon reviews certainly do have. 3Taps is claiming the posts can't be protected by copyright, because you can't copyright facts. I think their case has merit.


My understanding was that, as the owner of a site (even one that hosts mostly or entirely content created by someone else), you get to dictate the terms on which any party accesses that site. It would be like YouTube saying that you may not access their site with anything but a web browser.

Strange that they went the copyright way, though. IANAL, but isn't accessing a service you've been explicitly forbidden to (via ToS or similar) a violation of the CFAA?


3Taps doesn't access Craigslist, so the TOS has no effect.


You are 100% wrong. The ad copy on Craigslist IS copyright protected, so it doesn't matter where you scraped it from. TOSs do govern the data whether or not you are viewing a cached copy. By your logic all TOSs are rendered useless if you browse the net via proxy servers.

You can scrape anything off the internet you like and do as you please...but you can't create a business out of it.


Site TOSs don't really matter too much, or else Google could never exist. They're mostly there to limit the liability of the publisher, not to prevent the accessor from doing anything. Googlebot is not a lawyer; it can't decide the legal implications of scraping or what exact activity can be done with the data it finds.

Google also overlays facts it finds by crawling the web in maps (Google Places and the One Box results) and uses the creative contribution of other people to provide things like reviews inline on SERP. Google is a bad target to sue though because they will punch you right back in the face.


"You can scrape anything you like off the internet and do as you please... but you can't create a business out of it."

I tried to tell this to the Google guys in the 1990s but they didn't listen! :)


It is not black and white, despite your assertion. It should be tested in court.


Is 3Taps verifying by hand that every scraped post contains only facts? I've seen some pretty creative CL posts. Also, it's been mentioned that assemblages of facts can in fact be copyrighted. See the Farmers Almanac.


You're right about that, and I think that's going to make their claim a bit unsteady. But I see an argument they might make:

1. Classified ads are facts, with negligible creativity put into their composition.
2. Facts are not copyrightable.
3. Therefore, 3Taps can scrape the ads.
4. If any ad actually does have creativity / an applicable copyright, the copyright holder can contact 3Taps with a complaint.

Basically a "safe harbor" take on the whole thing. What do you think?


Feist v. Rural (http://en.wikipedia.org/wiki/Feist_v._Rural) seems to indicate that the arrangement of the facts is copyrightable (if there is even a minimal degree of creativity) even though the facts themselves are not. The court held that the alphabetic arrangement of names is not sufficient to warrant copyright protection, and it seems logical that mere temporal arrangement of classifieds would not merit protection either. However, it seems as though 3Taps would need to find a way to rearrange the facts presented within each advertisement, because the way an ad presents them certainly is the product of a creative process.

The seller has the freedom to write a text description and include pictures. Photographs have a long history of copyright protection, as does written work. Simply copying the specification sheet for a particular model of TV into the ad, for example, might not be copyrightable, but how does 3Taps sort those from the rest?

I just stole this from a random Craigslist ad:

"Black futon in good condition. The mattress is a lot thicker than most futons. It is pretty easy to assemble. You must be able to pick it up though because I do not have a truck. I am moving in a week so I need to get rid of it, asking for 100 or better offer. Please use the link above to email me. "

That the object is a futon, that said futon is black, thicker than average, easy to assemble and in good condition are all facts. The seller could have presented those facts in any number of ways, but he or she chose this way (proper spelling, for example, and mostly written in complete sentences) because he or she thought that would generate a better response. The only way I can see 3Taps on solid ground is if they can take the ad text, use it to generate a set of facts, and rewrite the ad from that (analogous to the way the PC BIOS was reverse-engineered, but done mechanically; I can't imagine it would scale well to have humans rewrite the ads).
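
To make the "generate a set of facts, then rewrite" idea concrete, here is a minimal Python sketch run against the futon ad quoted above. Everything in it is an assumption for illustration: the names `extract_facts` and `render_listing`, the tiny keyword lists, and the output format are hypothetical, and nothing here reflects what 3Taps actually does. Real ads would need far more robust extraction, and whether the result avoids copyright is a legal question the code can't answer.

```python
import re

# The ad text quoted above, used as sample input.
AD = (
    "Black futon in good condition. The mattress is a lot thicker than most "
    "futons. It is pretty easy to assemble. You must be able to pick it up "
    "though because I do not have a truck. I am moving in a week so I need "
    "to get rid of it, asking for 100 or better offer. Please use the link "
    "above to email me. "
)

def extract_facts(text):
    """Pull a few structured facts out of free-form ad text.

    The keyword lists are toy assumptions, not a real taxonomy.
    """
    lowered = text.lower()
    facts = {}
    item = re.search(r"\b(futon|sofa|table|desk)\b", lowered)
    if item:
        facts["item"] = item.group(1)
    color = re.search(r"\b(black|white|brown|red|blue)\b", lowered)
    if color:
        facts["color"] = color.group(1)
    condition = re.search(r"\b(excellent|good|fair|poor) condition\b", lowered)
    if condition:
        facts["condition"] = condition.group(1)
    price = re.search(r"asking for \$?(\d+)", lowered)
    if price:
        facts["price"] = int(price.group(1))
    return facts

def render_listing(facts):
    """Rebuild a listing from the facts alone, in a fixed mechanical format."""
    order = ("item", "color", "condition", "price")
    return "; ".join(f"{key}: {facts[key]}" for key in order if key in facts)

if __name__ == "__main__":
    print(render_listing(extract_facts(AD)))
```

The rendered listing reuses none of the seller's sentences, only the extracted values, which is the clean-room property the BIOS analogy is getting at. Note that it also silently drops facts the rules don't know about (the thick mattress, the pickup requirement), which is one reason this would be hard to do well at scale.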


Excellent points, especially regarding photographs. I was thinking of that case already, and hadn't considered the arrangement of facts within a post; what came to mind first was the collection of posts as their own arrangement of facts.


http://en.wikipedia.org/wiki/Web_scraping#Legal_issues

Hard to say what will happen. They will sure as shit get sued, but what will happen is up in the air as far as I can tell. If I remember correctly, if you are somehow changing or making new content from the data, then it is protected under fair use somehow.


I don't see how 3Taps is any different a client than Google is. Now, I don't know enough details, so I'm not saying 3Taps has a case. But if they scrape data off Craigslist, not via their API (to which they agreed to abide by terms and conditions), then they _do_ seem to have a case!


Is 3Taps different from Google simply because they show a map?

Maybe CL should have terms for their general website that say data can be scraped and cached but that it must appear in the cache either exactly as it does on CL, or with some things missing, but not with anything added (e.g. a map).


3Taps isn't a client application for Craigslist; they essentially turn CL's data into a feed and then sell it to folks who then create services on top of that data.

Padmapper is a client of 3Taps, and it's the client app that most of this is focused on.

BTW 3Taps doesn't scrape CL; they actually pull CL's data from Google's cache, so they can't be accused of violating CL's TOS.


Yeah, it would be futile for CL to go after Padmapper, so they are going after 3Taps. Why would they be more worried about 3Taps? Because anyone can do what 3Taps is doing. Anyone can pull CL data from a search engine cache like 3Taps does. There's no need to go to CL.

But one thing we know, if you look at the precedent, is that you can't sue Google and win. You either let them copy your site or opt out and lose the traffic. So what can you do?

Sue Google users! As far as I'm concerned, that's all 3Taps is. They are just aggregating what they pull from the cache.



