Deepdreaming without the Slugdogs (thehackerati.com)
63 points by geb on Sept 10, 2015 | 11 comments


I find it interesting that these hallucinations do not maintain a consistent 3D perspective: the images look really flat and any noticeable three-dimensionality is localized to a small part of the image.

Intuitively, this makes some sense: one wouldn't expect an object classifier to care much about viewpoint, so the amplified representation of a dog or a slug comes out flat. I suspect the fact that the bottom-most layers are convolutional also has something to do with it.

My dreams are a lot more perspective-correct, though. Deepdream certainly entertains the idea that biological dreaming might be somehow similar to gradient ascent. Even if that were so, it would mean that the sensory experiences we feel in dreams integrate a much more unified "reality" than what we would experience if we were dreaming with only an object classifier.
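
For anyone curious what "gradient ascent" means concretely here, below is a minimal sketch of the idea. I'm using PyTorch and a pretrained VGG16 purely for illustration; the actual deepdream code used Caffe and GoogLeNet, so every name and number below is my own choice, not theirs:

    import torch
    from torchvision import models

    # Pretrained CNN; we only need the convolutional feature stack.
    cnn = models.vgg16(pretrained=True).features.eval()
    LAYER = 20  # arbitrary mid-level conv layer (an illustrative choice)

    # Start from noise; deepdream proper starts from a photograph.
    img = torch.rand(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([img], lr=0.05)

    for step in range(200):
        opt.zero_grad()
        x = img
        for i, layer in enumerate(cnn):
            x = layer(x)
            if i == LAYER:
                break
        # Gradient *ascent*: minimize the negative activation norm so the
        # chosen layer becomes more strongly activated by the image.
        loss = -x.norm()
        loss.backward()
        opt.step()

Nothing in that objective asks for a coherent 3D scene, only for stronger activations, which fits the flatness you describe.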


In the brain, feature recognition and spatial analysis are handled by distinct neural pathways (ventral and dorsal, respectively, also known as the "what" and "where" pathways)[0].

Image recognition only emulates the former, whose destruction causes a condition called "blindsight"[1]. Affected people lose the subjective sensation of seeing and can't recognize anything, but they are still able to navigate their environment, avoid new obstacles, or put an envelope through a mail slot that is either vertical or horizontal without mistake.

0. https://en.wikipedia.org/wiki/Two-streams_hypothesis

1. https://en.wikipedia.org/wiki/Blindsight


Despite its name, the deepdream project seems more like an approximation of the closed-eye visuals you get from hallucinogens, especially the perspective-flattening effect that you point out.


I agree that there appears to be a qualitative difference, but that does not necessarily mean the fundamental architecture of the two systems differs. It may simply be that the objective function operating over the architectures is different. Perhaps if the only goal of a human mind were to classify images into predefined categories, it too would dream of puppy slugs. While I don't think the architectures are very similar, I think we have stumbled across an instance of a broad class of systems which all display these abilities.

Ever read George's Marvellous Medicine by Roald Dahl? We are George, having discovered magic by mixing random household ingredients together: amazed, joyful, but totally unable to explain the process by which we achieve our results. I'll say it once, I'll say it a million times: we should be looking to the thermodynamics of open dissipative systems for our answers.


Well, the nets aren't trained on stereoscopic data, so one shouldn't expect them to have very good depth perception.


I think these systems can only "express" themselves (input or output) in terms of 2D images, whereas human dreaming seems to operate in terms of abstract thoughts. We're not really dreaming in 3D, but dreaming in perception. If we could train a computer on 3D scenes in the same manner, it would be able to express its dreams in 3D, but we don't have the data to feed it.


We don't yet have the resources to train such a system on high-resolution video, but that's probably coming soon. There is already some work on action recognition in videos.


YouTube videos?


This one definitely had the most depth I've seen in a mapping algorithm. It's weird in that if you stand farther away you can almost see the room better.

http://graceavery.com/deepdreams/flickr_norm1_0013.jpg


> It's weird in that if you stand farther away you can almost see the room better

That's because lower spatial frequencies from the source image are preserved much more completely than higher frequencies. When you blur your eyes or stand farther away, you are filtering out the higher frequencies in the image you see. Since the noise is disproportionately at higher frequencies, reducing higher frequencies increases the SNR, so you're able to make out more.
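
A quick way to convince yourself of this, as a sketch (every name and number here is arbitrary): build a smooth low-frequency "scene", add noise that is mostly high-frequency, and compare the SNR before and after low-pass filtering:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    rng = np.random.default_rng(0)
    # Smooth, low-frequency "scene" plus broadband (mostly high-frequency) noise.
    scene = gaussian_filter(rng.normal(size=(256, 256)), sigma=16)
    noisy = scene + rng.normal(scale=scene.std(), size=(256, 256))

    def snr(estimate, reference):
        return reference.var() / ((estimate - reference) ** 2).mean()

    print(snr(noisy, scene))                      # raw image: SNR around 1
    print(snr(gaussian_filter(noisy, 4), scene))  # much higher after blurring

The blur barely touches the sigma=16 scene but crushes the noise; standing farther back (or squinting) is effectively that gaussian_filter step performed by your eyes.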

This illusion demonstrates the effect much more starkly: http://i.imgur.com/R4WI769.jpg


Try the interactive version, with one or two objects from the object list, at Twitch.tv: http://www.twitch.tv/317070

"Instead of using it for classification, we are showing it an image and asking it to modify it, so that it becomes more confident in what it sees. This allows the network to hallucinate. The image is continuously zooming in, creating an interesting kaleidoscopic effect."



