I used to work on automatic driving. I ran Team Overbot in the 2005 DARPA Grand Challenge. (We lost, but we didn't crash into anything.) So I'm painfully aware of the problems of automatic driving.
You need to be both looking at the road with cameras and profiling it with LIDAR. (Or terahertz radar, once that gets going.) It's not enough to just sense the car ahead. You need to be able to detect potholes, ice patches, junk on the highway, small animals, and similar problems. We could detect and avoid potholes back in 2005. Since we were doing off-road driving, that was a normal driving event.
The reason for a high-view LIDAR is that you want to see the pavement surface ahead from a reasonably useful angle and get a 3D profile of the road ahead. Google uses the Velodyne spinning-cone LIDAR scanner, which is a lot of LIDAR units built into one rotating mechanism. That's a research tool. There are other LIDAR devices more suited to mass production. Advanced Scientific Concepts has a nice eye-safe LIDAR which can operate in full sunlight. It costs about $100K, but that's because it's made by hand for DoD and space applications. The technology is all solid state, not inherently that expensive, and needs to be made into a volume product. (Somebody really needs to get on that. In 2004, I took a venture capitalist down to Santa Barbara to meet that crowd, but there was no mass market in sight back then. Now there is.)
You can only profile the road out to a limited distance, regardless of the sensor, because you're looking at the road from an oblique angle. Under good conditions, though, you can out-drive the range at which you can profile the road. That was Sebastian Thrun's contribution, and won the DARPA Grand Challenge. The idea is that if the LIDARs say the near road is good, and the cameras say the far road looks like the near road, you can assume the far road is like the near road and go fast. If the far road looks funny, you have to slow down and get a good look at the road profile with the LIDARs.
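Here's a minimal sketch of that speed-selection rule, assuming made-up thresholds, speeds, and a braking-deceleration figure (none of these values or helper inputs come from the actual Grand Challenge software):

    import math

    # Sketch of "don't out-drive your profiled range unless the far road looks
    # like the near road". All numbers are illustrative assumptions.

    BRAKING_DECEL = 4.0         # m/s^2, assumed comfortable hard-braking rate
    SIMILARITY_THRESHOLD = 0.8  # how closely the far road must resemble the near road
    TOP_SPEED = 25.0            # m/s, vehicle speed cap

    def max_safe_speed(profiled_range_m):
        """Speed at which the stopping distance fits inside the LIDAR-profiled range."""
        return math.sqrt(2.0 * BRAKING_DECEL * profiled_range_m)

    def choose_speed(lidar_profile_ok, profiled_range_m, far_near_similarity):
        if not lidar_profile_ok:
            return 0.0  # the road we *can* profile is bad: stop
        if far_near_similarity >= SIMILARITY_THRESHOLD:
            # Cameras say the far road looks like the (known-good) near road,
            # so it is acceptable to out-drive the profiling range.
            return TOP_SPEED
        # Far road looks funny: stay slow enough to stop within the profiled range.
        return min(TOP_SPEED, max_safe_speed(profiled_range_m))

    print(choose_speed(True, 30.0, 0.9))   # 25.0  - go fast
    print(choose_speed(True, 30.0, 0.4))   # ~15.5 - stay within the profiled 30 m
    print(choose_speed(False, 30.0, 0.9))  # 0.0   - stop

The key point is the last branch: when the cameras can't vouch for the far road, speed is capped so the stopping distance fits inside the LIDAR-profiled range.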
Automatic driving systems have to do all this. "Driver assistance" systems don't. Hence the "deadly valley".
That's just to deal with roads and static obstacles. Then comes dealing with traffic.
Humans have proved that, for the current level of driving, two not-particularly-high-resolution cameras are sufficient. It seems like pushing in this direction would let you remove this expensive component?
Humans also possess a hearing system, a balance system, and a highly advanced pattern-recognition system filled with auto-complete from a huge database of images (which to this date hasn't been replicated - face recognition doesn't count; it needs to recognize cars, signs, people, animals, pavement, trees, obstacles, etc.), not to mention knowledge of various possible scenarios, models of how their body/car/traffic works, and so on.
You get to use cheaper hardware, but you need software that is several orders of magnitude better.
> a highly advanced pattern-recognition system filled with auto-complete from a huge database of images (which to this date hasn't been replicated - face recognition doesn't count; it needs to recognize cars, signs, people, animals, pavement, trees, obstacles, etc.)
I expect that, after the first wave of clumsy LIDARing self-driving cars, all the car companies (Google especially) will be collecting training data from the cars' sensors to build exactly this kind of model. In fact, I wouldn't be surprised if that were what the Google car was really about, in the same way Google Voice is really about collecting speech training data.
The best part of this kind of training data is that it all comes pre-annotated with appropriate reinforcements: even if the image-recognition sensors aren't hooked up to anything, they're coupled to the input stream from the car's other sensors and the driver's actions. So you would get training data like:
- "saw [image of stopsign], other heuristically-programmed system decided car should stop, driver confirmed stop."
- "saw [image of kitten standing in the road], other heuristically-programmed system decided car should continue, driver overrode and stopped car."
Etc. Aggregating all these reports from many self-driving cars, you could build an excellent image-to-appropriate-reaction classifier.
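If you wanted to log that sort of pre-annotated data, the records might look roughly like the sketch below; the field names, labels, and toy aggregation are illustrative assumptions, not anything Google has described:

    from collections import Counter
    from dataclasses import dataclass

    # Toy sketch of pre-annotated training records like the ones above.
    @dataclass
    class DrivingEvent:
        image_id: str         # reference to the stored camera frame(s)
        detected_object: str  # what the heuristic system thought it saw
        planned_action: str   # what the heuristic system decided to do
        driver_action: str    # what the human actually did (the reinforcement signal)

        def label(self):
            return "confirmed" if self.driver_action == self.planned_action else "overridden"

    events = [
        DrivingEvent("img_001", "stop sign", "stop", "stop"),
        DrivingEvent("img_002", "kitten in road", "continue", "stop"),
    ]

    # Aggregating many such events yields (object seen, correct reaction) pairs
    # that an image-to-reaction classifier could be trained on.
    reaction_counts = Counter((e.detected_object, e.driver_action) for e in events)
    print(reaction_counts)
    print([e.label() for e in events])  # ['confirmed', 'overridden']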
Yes, but with voice data it's ok if the system gets it wrong occasionally. Worst-case scenario is the user gets annoyed and tries again (or gives up and does something else).
In a driving situation, the worst-case scenario is everybody dies.
I would guess that the processing power is all that matters. It's not difficult or particularly dangerous to drive without being able to hear. I would guess that people driving remote controlled cars with 360-degree views but no other cues would perform very nearly as well as real drivers.
The human eye can instantly recognize the available driving paths, the motorcyclist ahead, and project where people will walk. Software would have to parse out where the open roads are, how far that motorcyclist is and whether he can clear the intersection before the car reaches it, and what that sign on the right-hand side is—using the same information, but it has to parse it first whereas we do that almost instantly. It's a totally different game.
Yes, that's what I'm saying. I'm saying the other sensors the parent post mentioned weren't actually important with regards to driving, just our ability to parse the visual data into a meaningful model of the world around us.
I think hearing is also useful, from time to time. It's not AS critical as sight, but if nothing else it allows drivers to share their emotional state in a very primitive way and to gauge how their engine is performing.
While driving, humans assume that the road ahead is OK (at least free of significant potholes and, in hot sunny weather, of ice patches). We would expect better of robots (i.e. if a human crashes because of an oil patch, (s)he's a bad driver; if a computer crashes, it's a million-dollar lawsuit).
Edit: a better solution would be to observe the behaviour of other drivers; if there is someone driving ahead of you, you can assume that the road between you and them is OK; if there's no one ahead of you, you need to drive slower and be more careful (that's how I drive at night). Once there's a critical mass of cars with cameras, cars could communicate road conditions automatically.
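As a toy illustration of that heuristic (the speeds and the sensing interface are made-up assumptions):

    from typing import Optional

    # Toy version of "trust the stretch of road a lead vehicle has just driven over".
    NIGHT_SPEED = 10.0   # m/s, cautious speed when nobody has "tested" the road ahead
    CRUISE_SPEED = 25.0  # m/s

    def target_speed(lead_vehicle_distance_m: Optional[float]) -> float:
        if lead_vehicle_distance_m is None:
            # Nobody ahead: the surface is unverified, so be conservative.
            return NIGHT_SPEED
        # A lead vehicle just drove the road between us and them without incident,
        # so treat that stretch as probably OK.
        return CRUISE_SPEED

    print(target_speed(40.0))   # 25.0
    print(target_speed(None))   # 10.0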
> Once there's a critical mass of cars with cameras, cars could communicate road conditions automatically.
I would be frightened to trust the data coming from a random car in front of me. Inferring road condition from another car's behavior sounds reasonable; using data supplied by it, not so much.
You shouldn't and wouldn't rely unfailingly on what other cars merely report. If the car in front of you insists it's maintaining speed while your own readings indicate it's slamming on its brakes, you should assume it's slamming on its brakes.
However, if the car three cars in front of you just broadcast "I'm doing an emergency stop right now", that's really valuable data. The human in your car won't know anything is wrong for at least a second. The human driver behind you would know about it before the human driver in front of you.
That will be the most common failure mode for computer-driven cars: how easy they are to bring to a stop with a spoofed broadcast. (And, yes, sending one is probably criminal behavior.)
A computer-driven car, though, wouldn't (shouldn't) just immediately slam on the brakes because of that signal. It would tighten seat belts and start slowing down, but it also would want to avoid getting rammed by the car behind it. It can make very accurate estimates about its stopping distance and use all of it.
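A rough sketch of that policy, cross-checking the broadcast against the car's own ranging data and braking only as hard as the available gap requires; the message handling, margins, and numbers are all assumptions:

    # Sketch of reacting to a V2V "emergency stop ahead" broadcast without blindly
    # slamming on the brakes.
    MAX_DECEL = 8.0  # m/s^2, assumed hard-braking capability

    def stopping_distance(speed_mps, decel):
        return speed_mps ** 2 / (2.0 * decel)

    def plan_braking(own_speed_mps, gap_to_hazard_m, own_sensors_confirm):
        """Pick the gentlest deceleration that still stops within the available gap."""
        if own_sensors_confirm:
            gap = gap_to_hazard_m          # our own readings agree: use the full gap
        else:
            gap = gap_to_hazard_m * 0.8    # broadcast only: keep extra margin
        needed_decel = own_speed_mps ** 2 / (2.0 * max(gap, 1.0))
        decel = min(MAX_DECEL, max(2.0, needed_decel))  # never brake harder than needed
        return {"tighten_seatbelts": True,
                "decel_mps2": round(decel, 2),
                "stops_in_m": round(stopping_distance(own_speed_mps, decel), 1)}

    # Car three ahead broadcast an emergency stop; our own radar can't see it yet.
    print(plan_braking(own_speed_mps=25.0, gap_to_hazard_m=80.0, own_sensors_confirm=False))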
The internet works that way, and you seem to be fine with that. :\ Is it the "safety issue"? i.e. the internet can't crash you into a wall, it can only send you to rotten.com or steal your credit card...
Wow, no. I don't trust the internet to give me real facts about elephants, let alone anything life-threatening. http://en.wikipedia.org/wiki/Wikipedia:Wikiality_and_Other_T... If the equipment had some built-in tamper detection, and Google's sensors digitally signed data if they didn't detect tampering, then I might trust it enough to drive with.
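For what it's worth, the "sensor digitally signs its readings" part is straightforward today. Here's a minimal sketch using Ed25519 from the third-party `cryptography` package (the message format is an assumption for illustration):

    import json

    # Sketch of a sensor signing its readings so a downstream consumer can check
    # they weren't tampered with in transit.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

    # In practice the key pair would be provisioned in the sensor at manufacture.
    sensor_key = Ed25519PrivateKey.generate()
    sensor_pub = sensor_key.public_key()

    reading = json.dumps({"sensor": "front_lidar", "range_m": 42.7, "t": 1700000000}).encode()
    signature = sensor_key.sign(reading)

    # The receiving car verifies before acting on the reported reading.
    try:
        sensor_pub.verify(signature, reading)
        print("reading verified")
    except InvalidSignature:
        print("reject: tampered or forged reading")

That only proves the data came from an untampered sensor, of course; it says nothing about whether that sensor's readings are worth trusting in the first place.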
"Seeing" and "Perceiving" are likely very different. Yes, we only have binocular visual input, but the excess in processing in the brain takes perception to another level. However, machines have trouble with the perception part and so have to make up for it by seeing in excess.
The simple answer is that we don't know how to do it with stereo vision alone yet. Getting range reliably is hard, and your brain uses lots of tricks to do it.
The second answer is that we need better-than-human performance if this is to take off. So using human-type sensing might not be good enough anyway.
Lots of researchers are pushing on vision-based driving, though.
Human eyes are actually equivalent to very high-end video cameras, and the image processing that you can do in your squishy grey 10-watt processor is still way better than anything we can do with computers. You need your navigation system to be able to directly sense in 3 dimensions for it to be competitive.
Not really: we have high-resolution, in-focus vision only in a narrow field of view in the middle; everything else is not that good. We compensate for this with the ability to quickly move our eyes and refocus.
Well, that's already in consumer-level (ok, "prosumer") cameras. Also, we know that the "fps" of an eye is around 60 Hz, since that's the minimum refresh rate at which monitors look OK.
When you can fit an exaflop electronic computer into a car, then maybe two cameras would be sufficient. Right now, we have to make do with less, and better sensors can make up the difference.
That's a rough estimate of how much computing power is in the human brain. It's extremely efficient energy-wise, but massively parallel and weirdly put together so not entirely comparable to an electronic computer. Still, the computing resources available to process the images from the human eye are enormous.
Oh man, the vulnerabilities of using likely untrustworthy networked sensors for safety-critical operations boggle the mind. While the sensors could theoretically be made trustworthy to some level, I would be very wary of trusting them at the level of a safety-critical application. Inadvertent vulnerabilities and attacks by malicious actors would be catastrophic.
My prediction: "crashdummy" will eclipse "heartbleed" and "shellshock"!
I think they're not that far away. Google is talking about starting with small self-driving cars with a top speed of 25mph or so. (At that speed, you don't have to drive out of trouble; an emergency stop is sufficient.) I expect those will be common in retirement communities in 10 years or less.
Tesla is talking about automatically putting their car into a garage. That's a good application; it's slow, and you can have sensors all around the car to avoid hitting anything. A more general system that can put a car into a big parking garage or lot is quite possible. Cooperating parking garages might have some additional bar-code markers and maybe a data link for open space info.
The whole airport car-rental thing could be done automatically, using slow-speed automatic driving to bring the car up to a pickup point at the terminal, just as the renter gets there. That may be one way this gets deployed. (I proposed that around 2003, but after 9/11, the idea of autonomous vehicles in an airport seemed politically hopeless.)
Those are some ways this might be deployed. Everyone has obsessed on automatic freeway driving since the 1950s, but that may not be the killer app.
I just had a vision of a youtube video from the near future of some slow driving cars on automatic parking mode getting stuck in loops against each other. Somewhere in the background, a dog barks.
I remember reading an article years ago about automatic driving first coming to commercial shipping (i.e. 18 wheelers and other cargo vehicles) using a separated lane on highways, but that the political pushback (jobs lost) was making that a difficult sell.
Did you all have to do work with navigating through snow and heavy rain? I was under the impression that snow is a big challenge and heavy rain isn't much better. Does that still hold true today?
It is one thing to anticipate threats, but when the road is under snow, how do computers make the intuitive leap that people can?
The duty cycle on LIDARs is very low. If you're ranging to 200 meters, the receiver is only taking data for about 1.3µs per pulse. At 60 Hz scanning, the receiver is active for roughly 80µs/sec, or 0.008% of the time. So in the presence of 100 other transmitters (worst case), you'll get a conflict about 0.8% of the time. If the transmit time is randomized slightly (which I don't think Velodyne does, but a production device must), you won't get the same bogus reading twice in a row. Over three readings, if you throw out outliers, this problem should go away.
If the LIDAR data has too many outliers, it's necessary to slow down and only use data from short ranges. At some short range, the LIDAR will "burn through" any jamming from a more distant range, per the radar equation. I agree that on production vehicles, anti-jam software, as described above, will be necessary.
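The arithmetic behind those figures, as a quick back-of-envelope script (the 200 m range, 60 Hz pulse rate, and 100-transmitter worst case are the assumptions from the comment above):

    # Back-of-envelope mutual-interference estimate.
    C = 299_792_458.0  # speed of light, m/s

    range_m = 200.0
    listen_window_s = 2 * range_m / C            # ~1.33 us round-trip time per pulse
    pulses_per_s = 60
    duty_cycle = listen_window_s * pulses_per_s  # fraction of time the receiver is open

    other_transmitters = 100
    conflict_fraction = min(1.0, duty_cycle * other_transmitters)

    print(f"listen window per pulse: {listen_window_s * 1e6:.2f} us")  # 1.33 us
    print(f"receiver duty cycle:     {duty_cycle:.4%}")                # ~0.0080%
    print(f"expected conflict rate:  {conflict_fraction:.2%}")         # ~0.80%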
My experience says probably not. LIDARs are very directional at any given instant, and you need to filter out outliers anyways thanks to things shimmering in the sun.