Startup Mimics Human Eye By Adding Processing to Pixels


An early-stage company spun out of Johns Hopkins University wants to make machine vision more like human vision by adding memory and computing to each sensor pixel. Oculi is developing products for gesture recognition and eye tracking in consumer AR/VR systems. Other applications include smart city infrastructure and, eventually, automotive vision sensing.

Beyond buzz over existing event-based vision sensing frameworks, Oculi CEO Charbel Rizk told EE Times there’s plenty of room for innovation elsewhere.

Charbel Rizk (Source: Oculi)

“The problem that we’re running into right now with machine vision is that we’re using sensors and processors that were developed for different purposes, putting them together and thinking if we throw enough processing downstream we solved the problem,” Rizk said. “That’s not the case, because the problem really starts at the sensor. Machine vision is not about pretty images. It really should be about efficiency: How do we get the information in an efficient way?”

Oculi’s sensing and processing unit (SPU) chip integrates both capabilities at the pixel level, similar to how the human eye works. An Oculi pixel includes a sensor alongside digital processing (logic and a small memory), making the pixel smart enough to deliver information only when it detects something of interest. A full frame output is still possible if the application requires it; the user can select among output modes in software to optimize privacy, latency and power consumption. For most applications, the SPU runs on milliwatts of power, Rizk added.

Like existing dynamic vision sensors that use event-based vision, Oculi’s sensor can output events that signal changes in pixel data. Rizk said this type of event alone lacks the efficiency needed to emulate human vision.

“There are times when you’re not looking for changes, you’re looking for something in the scene,” he said. “Event sensors out there today don’t give you any information at that point, they become blind. There are times when you do want the full frame… so we built our architecture to allow you to get all these outputs using software.”
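As a rough illustration of the basic event output described above, the sketch below flags pixels whose intensity changed between two frames. It is a simplified frame-difference model, not Oculi’s design: the function name `detect_events`, the threshold value and the tiny test frames are all assumptions made for this example, and real event sensors typically respond to per-pixel changes asynchronously rather than by comparing whole frames.

```python
def detect_events(prev_frame, curr_frame, threshold=10):
    """Return (row, col, polarity) for each pixel that changed by more than threshold.

    Hypothetical helper for illustration only. Frames are lists of rows of
    integer intensities; polarity is +1 for brighter, -1 for darker.
    """
    events = []
    for r, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for c, (before, after) in enumerate(zip(prev_row, curr_row)):
            diff = after - before
            if abs(diff) > threshold:
                events.append((r, c, 1 if diff > 0 else -1))
    return events


# Two made-up 2x3 frames: one pixel brightens, one darkens.
prev_frame = [[10, 10, 10], [10, 10, 40]]
curr_frame = [[10, 50, 10], [10, 10, 5]]

changed = detect_events(prev_frame, curr_frame)

# A static scene produces no events at all, which is the "blind" case
# Rizk describes for purely event-based output.
static = detect_events(curr_frame, curr_frame)
```

In this toy model, only the two changed pixels are reported, and an unchanging scene reports nothing, which is why a full-frame or other output mode is still needed when the object of interest is not moving.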

Oculi’s SPU combines vision sensing, processing and memory at the pixel level. (Source: Oculi)

With every pixel capable of some basic computation, algorithms can be implemented on the SPU without external processing. On top of full frame images and events, that allows two additional forms of output.

One is “smart events,” which use less than 10 percent of the bandwidth of a full-frame image but contain sufficient information for an application. Smart events can also be based on color or depth sensing (using two SPUs for high-speed stereo vision). Crucially, smart events vastly reduce signal noise compared to basic event-based vision; because each pixel has memory, consistency can be evaluated over multiple frames to help eliminate noise. Bandwidth is also lower than with purely event-based vision.
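The idea of using per-pixel memory to check consistency over multiple frames can be sketched as follows. This is a minimal model under stated assumptions, not a description of Oculi’s pixel logic, which the company has not detailed here: the class name `SmartEventFilter`, the stored reference value, the streak counter and the `min_frames` setting are all hypothetical. Each pixel remembers a reference intensity and only reports a change that persists for several consecutive frames.

```python
class SmartEventFilter:
    """Toy per-pixel memory: report a change only if it persists (assumed design)."""

    def __init__(self, first_frame, threshold=10, min_frames=2):
        self.threshold = threshold
        self.min_frames = min_frames
        # Per-pixel memory: a reference intensity and a persistence counter.
        self.reference = [list(row) for row in first_frame]
        self.streak = [[0] * len(row) for row in first_frame]

    def update(self, frame):
        """Process one frame and return the (row, col) pixels with confirmed changes."""
        confirmed = []
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                if abs(value - self.reference[r][c]) > self.threshold:
                    self.streak[r][c] += 1
                else:
                    self.streak[r][c] = 0  # change did not persist: treat as noise
                if self.streak[r][c] >= self.min_frames:
                    confirmed.append((r, c))
                    self.reference[r][c] = value  # accept the new level
                    self.streak[r][c] = 0
        return confirmed


# Made-up 1x3 sequence: pixel 1 blips for one frame (noise),
# pixel 2 changes and stays changed (a real event).
frames = [
    [[10, 10, 10]],
    [[10, 60, 60]],
    [[10, 10, 60]],
    [[10, 10, 60]],
]
smart_filter = SmartEventFilter(frames[0])
outputs = [smart_filter.update(frame) for frame in frames[1:]]
```

Here the one-frame blip never produces an output, while the persistent change is reported once. A plain frame-difference detector would have reported the blip twice (once rising, once falling), which is the kind of noise and bandwidth saving the article attributes to smart events.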

A successful field test of Oculi’s sensor in Chicago counted axles of passing vehicles for electronic toll billing; the test also estimated speed using smart events.

Possible outputs from Oculi SPU (L-R): Full frame image, events, smart events, actionable data (top row is number of vehicle axles, bottom row shows the hand gesture “swipe right”). (Source: Oculi)

The other output, “actionable information,” reduces bandwidth even further by processing smart events on the SPU using pattern recognition techniques. For some applications, further vision processing is not required.

For example, Oculi hardware deployed as part of a smart city infrastructure field test was reprogrammed at the customer’s request to function as a flash-flood alert system. The sensor was calibrated to count raindrops falling in front of a camera to estimate precipitation. (This was done by identifying the distinctive size and motion of raindrops.) The rainfall estimate was computed entirely on the SPU.

Gesture recognition

Oculi was spun out of Johns Hopkins University in 2019 based on Rizk’s academic work, including multiple generations of test chips. The startup is in the process of closing a seed funding round while negotiating with potential foundry partners.

While the technology was originally focused on military applications (early demonstrations detected muzzle flashes), Oculi is now pursuing both consumer and automotive applications. For now, the company is targeting gesture recognition and eye tracking in consumer AR/VR systems. AR/VR vendors want to eliminate handheld remote controls and conserve battery power in headsets. Oculi is also working with automotive manufacturers on future ADAS/AV opportunities. Smart city infrastructure, facial recognition and person detection are also being considered.

Oculi’s roadmap includes a product family with varying levels of on-chip processing capability. Future devices could add AI capabilities on-chip. Engineering samples on demo boards (single and dual/stereo SPUs) and a software development kit are available now.
