Hearst, AlphaPrime, ENIAC and Samsung Next Talk Opportunities in Visual Tech Investing

At the LDV Vision Summit 2017, Erin Griffith of Fortune spoke with Vic Singh of ENIAC Ventures, Claudia Iannazzo from AlphaPrime Ventures, Scott English of Hearst Ventures and Emily Becher from Samsung Next Start about trends and opportunities in visual technology investing.

Watch their panel discussion to learn more:

Our fifth annual LDV Vision Summit will be May 23 & 24, 2018 in NYC. Early bird tickets are currently on sale. Sign up to our LDV Vision Summit newsletter for updates and deals on tickets.

LDV Capital Raises $10M Second Seed Fund for Visual Technologies

Evan Nisselson, General Partner & Founder of LDV Capital © Ron Haviv

We are very excited to announce the close of our second fund for investing in people building visual technology businesses at the pre-seed or seed stage. You can read more about it in The Wall Street Journal. Our press release is below. Also check out our Jobs page to learn more about the exciting new roles available with us at LDV Capital.

Press Release -- LDV Capital, the venture fund investing in people building visual technology businesses, today announced a new $10M seed fund. It is the second fund for the thesis-driven firm that specifically invests in deep technical teams that leverage computer vision, machine learning and artificial intelligence to analyze visual data.

Investors in this second fund include top technical experts in the field including Mike Krieger, Instagram Co-founder/CTO and Steve Chen, YouTube Co-founder/CTO. Other investors came from family offices, fund-of-funds, an endowment, a sovereign wealth fund, and more.

“Because of their domain expertise and leadership in visual technology, LDV Capital is at the forefront of innovations in the space. They invest in and empower technical founders with the greatest potential for harnessing the power of computer vision to disrupt industries. The opportunities are tremendous.” Mike Krieger, Instagram, Co-Founder & Director of Engineering.

"Capturing and analyzing visual data with the aid of computers create a paradigm shift in the approach to content. I believe LDV Capital helps founders grow companies at the helm of this evolution." Steve Chen, Youtube, Co-Founder & CTO.

LDV Capital investments at the pre-seed stage include Clarifai - an artificial intelligence company that leverages visual recognition to solve real-world problems for businesses and developers, Mapillary - delivering street-level imagery for the future of maps and data solutions, and Upskill - delivering augmented reality solutions for the industrial workforce. They have assisted their portfolio companies in raising follow-on capital from Sequoia, Union Square Ventures, NEA, Atomico and others.

“Visual technologies are revolutionizing businesses and society,” says LDV Capital General Partner, Evan Nisselson, a renowned thought leader in the visual tech space. “By 2022, our research has found there will be 45 billion cameras in the world capturing visual data that will be analyzed by artificial intelligence. Our goal is to collaborate with technical entrepreneurs who are looking to solve problems, build businesses and improve our world with that visual data.”

LDV’s horizontal thesis spans all enterprise and consumer verticals such as: autonomous vehicles, medical imaging, robotics, security, manufacturing, logistics, smart homes, satellite imaging, augmented/virtual/mixed reality, mapping, video, imaging, biometrics, 3D, 4D and much more.  

Every May, LDV Capital hosts the two-day LDV Vision Summit in NYC, known to top technologists, investors and entrepreneurs as the premier global gathering in visual tech. The fifth annual LDV Vision Summit will be May 23 and 24, 2018. Since 2011, LDV Capital has also held invite-only, gender-balanced monthly LDV Community dinners that bring together leading NYC entrepreneurs and investors to help each other succeed. Both are part of their LDV Platform initiatives.

LDV Capital is one of a growing number of single-GP funds, founded by Nisselson in 2012 after he built four visual technology startups over 18 years in Silicon Valley, NYC and Europe. The firm boasts an exceptionally strong expert network, with experts-in-residence including computer vision leaders such as Serge Belongie, a professor of Computer Science at Cornell University who has also co-founded several companies; Andrew Rabinovich, Director of Deep Learning at Magic Leap; Luc Vincent, VP of Engineering at Lyft; and Gaile Gordon, Vice President of Location Products at Enlighted.

Find out more about our open opportunities on our Jobs page.

Building an MRI Scanner 60 Times Cheaper, Small Enough to Fit in an Ambulance

©Robert Wright/LDV Vision Summit

Matthew Rosen is a Harvard professor. He and his colleagues at the MGH/A.A. Martinos Center for Biomedical Imaging in Boston work on applications of advanced biomedical imaging technologies. At the LDV Vision Summit 2017 he spoke about how he is hacking a new kind of MRI scanner that’s fast, small, and cheap.

It's really a pleasure to talk about some of the work we've been doing in my laboratory to revolutionize MRI, not by building more expensive machines with higher and higher magnetic fields, but by going in the other direction. By turning the magnetic field down and reducing the cost, we hope to make medical devices that are inexpensive enough to become ubiquitous.

MRI is the undisputed champion of diagnostic radiology. These are very expensive, massive machines that are really confined to the hospital radiology suite. That's due, in large measure, to the fact that they operate at very high tesla strength magnetic fields. If you imagine taking an MRI scanner and putting it in an environment like a military field hospital, where there may be magnetic shrapnel around, you could really injure someone or worse.

Our approach is to go all the way down to the other end of the spectrum, at around 6.5 millitesla, roughly 500 times lower magnetic field than a clinical scanner, and I'll talk about work we've done on a homemade scanner built around a high-performance electromagnet, with high-performance linear gradients for spatial encoding. You can't just turn the magnetic field of an MRI scanner down and expect to make high quality images. This really comes down to the way we make measurements in MRI.


LDV Capital is focused on investing in people building visual technology businesses. Our LDV Vision Summit explores how visual technologies leveraging computer vision, machine learning and artificial intelligence are revolutionizing how humans communicate and do business.

Early Bird tickets are now available for the LDV Vision Summit 2018 to hear from other amazing visual tech researchers, entrepreneurs, and investors.

We are accepting applications to our Vision Summit Entrepreneurial Computer Vision Challenge for computer vision research projects and our Startup Competition for visual technology companies with <$2M in funding. Apply now &/or spread the word.


We use inductive detection. This is something you were all familiar with as a child: you take a magnet, move it through a loop of wire, and you generate a voltage. In this case, the moving magnet actually comes from the nuclear polarization of the water protons typically in your body, and, in fact, what Richard Ernst calls "the powers of evil" has to do with the fact that nuclear magnetic moments are very, very small.

If you're interested in making images of this quality over the span of a few seconds or minutes, it means you need to make this multiplicative term B, the magnetic field, very, very high. That means that all clinical scanners operate in the tesla range, typically around 3 tesla. Knowing that, what sort of images do you think we'd be able to make at our field strength, roughly 500 times lower magnetic field, which is a calculated SNR of around 10,000 times lower? Well, you'd probably guess that we couldn't make very good images, and, in fact, you'd be right.
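As a rough back-of-the-envelope check (a sketch, not from the talk: it assumes inductive-detection SNR scales roughly as B to the 3/2 power, one common approximation), the numbers quoted above line up:

```python
# Sanity check of the quoted SNR penalty at low field.
# Assumption (not stated in the talk): SNR of inductive detection scales roughly as B**1.5.
B_clinical = 3.0     # tesla, typical clinical scanner
B_low = 6.5e-3       # tesla, the scanner described here

field_ratio = B_clinical / B_low      # roughly 460x lower field, i.e. "about 500 times"
snr_penalty = field_ratio ** 1.5      # on the order of 10,000x lower SNR under this scaling

print(f"Field ~{field_ratio:.0f}x lower, SNR ~{snr_penalty:,.0f}x lower")
```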

Up until a few years ago, these were the kind of images we were making in our scanner. This is, in fact, a human head, if you can believe it. It's a single slice, took about an hour to acquire, and nobody was very interested in this at all.

If this is all we had, I wouldn't be here today, so let me tell you how we solved these problems.

Really, how do you solve a hard problem? What we've been working on is a suite of technology, half of it based in physics, half of it based in the availability of inexpensive compute. The physics applications are really about improving the signal strength, or the signal-to-noise, coming out of the body and into our detectors, and then the compute side is really about reducing the noise or getting more information from the data we have, or fixing it in post, as some people in this audience might call it.

Let's start really at the beginning, our acquisition strategy. The way you do NMR, or at least the way we do NMR at, remember, very, very low magnetic fields with very, very low signals, is we take our magnetic field. We turn it on. In red is that very small nuclear polarization I talked about. We apply a resonant radiofrequency pulse. We tip the magnetization into the transverse plane, and then we apply a series of coherent radiofrequency pulses to drive that magnetization back and forth very, very rapidly.

Then, again analogous with this inductive detection approach, we detect our signal, but not using a giant hand and a magnet moving, but instead using a 3D printed coil, in this case around the head of my former colleague, Chris LaPierre, to detect this very, very small, but with a very high data rate signal. We call this Balanced Steady-state Free Precession. That's a bunch of words. What it really means is that we now have an approach to very rapidly sample this, although very, very small, signal coming from the head.

What this has allowed us to do is to make images like this. In six minutes, we can make a full 3D dataset, roughly 2.5 millimeter in-plane resolution, 15 slices. Just remember, this is the same machine, okay? The difference between these images has to do with the way we interrogate the nuclear spins, the fundamental property of the water in the body, in this case, and the way we sample it. That's pretty nice. Having a high data rate actually allows us to now build up even higher quality images by averaging, and those are some images shown here, but there are other approaches, and this is where we start really talking about compute.
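As a side note on that averaging step: averaging N repeated acquisitions improves SNR by roughly the square root of N, which is why a high data rate matters. A minimal sketch with synthetic numbers (not the scanner's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
true_signal = 1.0          # arbitrary pixel value
noise_sigma = 10.0         # noise much larger than the signal, as at very low field

for n_avg in (1, 16, 256):
    # n_avg noisy repetitions of the same measurement, averaged together
    shots = true_signal + noise_sigma * rng.standard_normal((n_avg, 100_000))
    averaged = shots.mean(axis=0)
    measured_snr = true_signal / averaged.std()
    expected_snr = np.sqrt(n_avg) * true_signal / noise_sigma
    print(f"{n_avg:4d} averages: SNR ~ {measured_snr:.2f} (expected ~ {expected_snr:.2f})")
```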

Pattern matching is an interesting approach people are very familiar with in the machine learning world, but we all know about this from basic physics. As an example, think of curve fitting, which you could think of as pattern matching. In curve fitting, you have some noisy data, shown as open circles here. You have some model for the way that data depends on some property, say time, so you take your functional form, you fit that function to the data, and you extract not only the magnitude of the effect but also additional information, in this case a time constant of some NMR CPMG data.
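As a concrete version of that curve-fitting example, here is a minimal sketch (entirely synthetic numbers, chosen for illustration) that fits an exponential decay to noisy points and pulls out a time constant, in the spirit of extracting T2 from CPMG data:

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(t, amplitude, t2):
    """Assumed functional form: simple exponential decay."""
    return amplitude * np.exp(-t / t2)

rng = np.random.default_rng(1)
t = np.linspace(0, 0.5, 40)                    # seconds
true_amplitude, true_t2 = 1.0, 0.12            # illustrative values only
noisy = decay(t, true_amplitude, true_t2) + 0.05 * rng.standard_normal(t.size)

# Fit the model to the noisy data; popt holds the recovered amplitude and time constant.
popt, _ = curve_fit(decay, t, noisy, p0=(0.5, 0.05))
print(f"Recovered amplitude ~ {popt[0]:.2f}, time constant ~ {popt[1] * 1000:.0f} ms")
```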

The MRI equivalent of pattern matching is known as magnetic resonance fingerprinting. In contrast to what we did above, where we add up all of these very noisy images to make a higher quality image, in this case, we don't average actually. We just acquire the raw data. You see the data coming in to the lower left. These are very, very noisy, highly under-sampled images that normally you would sum together. The interesting thing we do here is we sort of dither the acquisition parameters a little bit. In the upper left, we show exactly how much we tip the magnetization, and in the upper right, we vary a little bit about the time in between individual acquisitions.

©Robert Wright/LDV Vision Summit

©Robert Wright/LDV Vision Summit

What do we do with this data? Well, here is one of those images. I'll plot the time dependence of the signal. We call that the fingerprint. Why do we call it a fingerprint? Well, very much analogous with the partial fingerprint, smudged fingerprint, you might find at a crime scene, there are lots of ridges and valleys and things that distinguish that information or that fingerprint. If you were trying to identify who this fingerprint belonged to, you would search your database, and then you would find, hopefully, a match, which gives you not only the complete fingerprint, which is interesting, but actually it's tied to a record, in this case, my collaborator, Chris Farrar.

What we do in this case, for the MRI equivalent, is we take our MRI fingerprint. We search a database, in this case of precomputed NMR trajectories, which is the physics that defines how the magnetization evolves as a function of time. We find our best match in red. That tells us not only the intensity, M0, of the signal at that particular pixel, but also other parameters, which in this case tell you about the local magnetic environment, both of the machine and of the body.
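A minimal sketch of that dictionary search, with random vectors standing in for the precomputed trajectories (the dictionary, the parameter grid and the noise level here are all made up for illustration): each measured fingerprint is compared to every dictionary entry with a normalized inner product, and the best match carries the tissue parameters.

```python
import numpy as np

rng = np.random.default_rng(2)
n_entries, n_timepoints = 5000, 500

# Stand-in for a dictionary of precomputed NMR signal trajectories,
# one row per hypothetical (T1, T2) pair.
dictionary = rng.standard_normal((n_entries, n_timepoints))
params = np.column_stack([rng.uniform(0.2, 2.0, n_entries),    # hypothetical T1 (s)
                          rng.uniform(0.02, 0.3, n_entries)])  # hypothetical T2 (s)

# A "measured" fingerprint: one dictionary entry buried in heavy noise.
truth = 1234
fingerprint = dictionary[truth] + 3.0 * rng.standard_normal(n_timepoints)

# Normalized inner-product matching against every dictionary entry.
dict_unit = dictionary / np.linalg.norm(dictionary, axis=1, keepdims=True)
scores = dict_unit @ (fingerprint / np.linalg.norm(fingerprint))
best = int(np.argmax(scores))

print(f"Best match: entry {best} (true entry {truth}), T1/T2 estimate: {params[best]}")
# The M0 intensity follows from the scale factor between the fingerprint and the matched entry.
```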

What does this compute-based pattern matching approach do for our data at low field? Well, in addition to giving us images, like on the first line, which are very similar to the last images I showed you, we get all of this additional information for free. In this case, it's quantitative information, again about the local magnetic environment of the tissue, so-called T1 and T2, as well as properties of the instrument and the local magnetic fields. Okay, so compute with our noisy data actually allows us to have more information than we would get with a standard approach.

The last thing I want to talk about is something my collaborator mentioned early on, something we've only talked about publicly for about a month: the idea that there may be something to be learned from natural vision.

It comes down to a very interesting point, which is that the brain is really, really good at taking noisy data, especially in low light, and doing pattern matching on textures and edges. At low field, we generate noisy data all day long. So can we take that low SNR data and process it in a framework that's based on the way the retina handles data, from the neuronal currents through to the reconstruction of a final image via perceptual learning, which is a data-driven, lifelong approach? Can we analogously build a way of handling the voltages coming out of our NMR coil and the actual data to reconstruct images, using a similar data-driven training approach?

©Robert Wright/LDV Vision Summit

We call that AUTOMAP, which is automated transform by manifold approximation. It's broader than MRI, but I'll talk about it specifically in this case. It allows us to recast image reconstruction as a supervised learning task. In this case, we train up a joint manifold. One manifold consists of the data, the voltages coming in from the scanner itself, and then the other manifold is the image representation of that.

We've built it up as a deep neural network. The reason to do this is that we can take that matrix of sensor data, those voltages coming in again from that inductive detection. We're talking macroscopic things here, right? A coil wrapped around the head of a person in a magnetic field. Put that data in on the left side of this, and out comes a reconstructed image. The reason it does a good job, as I'll show you, is that it not only subsumes the mathematical transform between the sensor data and the final image, but it also takes advantage of properties of natural images, such as image sparsity.
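As a toy illustration of that architecture (my sketch of the idea, not the published AUTOMAP model: layer sizes, counts and activations here are invented), fully connected layers learn the transform from flattened sensor-domain data to an image estimate, and convolutional layers then refine it using natural-image statistics:

```python
import torch
import torch.nn as nn

class ToyAutomap(nn.Module):
    """Toy sensor-domain-to-image reconstruction network (illustrative only)."""

    def __init__(self, n: int = 64):
        super().__init__()
        self.n = n
        # Fully connected layers learn the (otherwise hand-coded) transform
        # from raw measurements to an initial image estimate.
        self.fc = nn.Sequential(
            nn.Linear(2 * n * n, n * n), nn.Tanh(),   # 2x for real/imaginary channels
            nn.Linear(n * n, n * n), nn.Tanh(),
        )
        # Convolutional layers clean up the estimate, exploiting image sparsity.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),
        )

    def forward(self, sensor_data: torch.Tensor) -> torch.Tensor:
        # sensor_data: (batch, 2 * n * n) flattened raw samples from the coil
        x = self.fc(sensor_data).view(-1, 1, self.n, self.n)
        return self.conv(x)

# Training would use supervised pairs of (simulated sensor data, ground-truth image).
model = ToyAutomap()
fake_measurements = torch.randn(4, 2 * 64 * 64)
print(model(fake_measurements).shape)   # torch.Size([4, 1, 64, 64])
```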

Here are, very quickly, some examples of this. This is radially sampled MRI data, an SNR of around 100, and the conventional reconstruction, which is a complicated iterative reconstruction, looks like this. The same data, fed into AUTOMAP, reconstructs like this. I'm not just cherry-picking. It doesn't matter what acquisition strategy you use here. In all cases, you get superior immunity to noise using this neural network based approach to reconstruct these raw voltages into images.

The interesting thing about this, like all supervised learning approaches, is that it can learn any encoding. That makes it relevant beyond MRI, but also within MRI, because there is a whole zoo of acquisition strategies that people use. This really reminds me of the Google DeepMind Atari Breakout program, if you've seen it before. A neural network was taught to play Breakout, which is interesting enough, but if you watch for a while, you'll see that the neural network pretty quickly learned a really good acquisition strategy for playing the game, where it runs the ball up one side and uses the back wall to maximize its points.

Think about that for a minute. Are there optimal ways of sampling this data that we just haven't thought of? You can see, actually, all these encodings shown on the left side are geometric, radial, spiral, Cartesian. That's because we're logical people, and we think about things in terms of geometry, right? But if you let all the parameters run, you can imagine doing much, much better.

In conclusion, I've shown that MRI is possible outside the scanner suite, through a combination of physics and compute, both sensors and sequences, as well as these fingerprinting approaches and AUTOMAP. Now, what are the implications for health care?

Well, fortunately, as you can see from the scanner in the upper right, we are not limited to the existing footprint of our test system. The physics and the compute are basically length-invariant. They scale. You can build a smaller scanner that takes advantage of a lot of the innovations we've developed. It's really built around the idea of using inexpensive hardware with scalable, mostly GPU-based compute.

The question really is, what is the clinical implication of time and resolution, because there's a trade-off between them. Our images will never be as good as those from a 3 tesla scanner. That's just physics, okay? But every day, clinicians make a decision between speed, specificity, resolution, and cost in medical imaging and in health care. A really good example of a highly optimized version of that is the stethoscope, right? That's a $50 object. Its resolution is like this, if you even want to think of it as having a resolution, but in the hands of a clinician, it can tell if someone has pneumonia or a cardiac arrhythmia.


Imagine if you could use the MRI scanner as a ubiquitous tool, say in a CVS Minute Clinic, a military field hospital, a sports arena, a neuro ICU, for chronic care conditions, or at home, monitoring, say, long-term effects of chemotherapy. As long as the cost becomes low enough, and this metric of time versus resolution is net positive, I think it's a really useful tool. This really reminds me, of course, of everyone's favorite scene from Wall Street, right?

This is the first time pretty much anyone saw a cellphone, and that was sort of neat, but the cellphone, of course--and this audience knows this very clearly--the cellphone has become useful, because it's ubiquitous. Everyone has one, and that's led to new ways of connecting between people.

Imagine what you can do, just adding layers of data mining and health care and telemedicine on top of the idea of these ubiquitous sensors. With that, I want to acknowledge my group members, both past and present, and, of course, our funding agencies, and you guys for listening. Thanks so much.

We are accepting applications to our Vision Summit Entrepreneurial Computer Vision Challenge for computer vision research projects and our Startup Competition for visual technology companies with <$2M in funding. Apply now &/or spread the word.

JW Player, VH1 and Tout Discuss What's On TV Now

At the LDV Vision Summit 2017, Rebecca Paoletti of Cake Works spoke with Brian Rifkin of JW Player, Michael Downing of Tout and Orlando Lima from Viacom/VH1 about how traditional TV broadcasters are becoming digital while digital platforms are investing in traditional television models and programming.

Listen to their thoughts on creating valuable branded content, applying machine learning, and much more: 

 

Our fifth annual LDV Vision Summit will be May 23 & 24, 2018 in NYC. Early bird tickets are currently on sale. Sign up to our LDV Vision Summit newsletter for updates and deals on tickets.

How Professor Ira Kemelmacher-Shlizerman Built Dreambit & Sold it to Facebook

© Robert Wright/LDV Vision Summit

Ira Kemelmacher-Shlizerman is a professor at the University of Washington and a Research Scientist at Facebook. At our LDV Vision Summit 2017 she spoke about how and why she evolved her research into a company called Dreambit, which she then sold to Facebook.

I'm supposed to give a talk about how to combine academia with industry, and so I chose to do it by telling three stories. I'm both a research scientist at Facebook and a professor at the University of Washington. And so, I will make the talk super oversimplified and humorous.

Story number one, I go to the Weizmann Institute of Science to be a grad student and my advisor is Ronen Basri. The first problem that I decided to work on is, I want to be able to take a single photo of a person and reconstruct the 3-dimensional shape of her face. It kind of makes sense that we should be able to do it, because as humans, when we look at a face - based on the shading, based on the prior knowledge of faces that we have - we can imagine how she looks from different sides, just from a single photo. So, I wanted to create an algorithm that can do it automatically. And it didn't exist when I started my grad studies, so I thought it was a worthwhile problem.

I worked on it and we worked out the math for doing it from a single photo, and the cool part about the math is that I could apply it to anything. I could reconstruct Mona Lisa, I could reconstruct Clint Eastwood and so on. And we published a paper and presented it at a conference, and I was so, so excited. Here's the business part.
 


LDV Capital invests in deep technical people building visual technologies. Our LDV Vision Summit explores how visual technologies leveraging computer vision, machine learning and artificial intelligence are revolutionizing business and society.

Early Bird tickets are now available for the LDV Vision Summit May 23 & 24, 2018 in NYC to hear from other amazing visual tech researchers, entrepreneurs and investors.


I wanted to show the results so I went to my family, to my husband Elliott and my brother Mike and I said, "Check this out! The results I got are so cool." And we started talking and we came up with some business ideas, it was 2006. We said "oh, it can be used in Second Life, and it can be used as avatars in games and all that."

And my brother said "I know this guy, that knows this VC that is going to come from the US next week, should we just talk with him?" And I said "Yes, sure. Let's do it." And so, the VC came from the US and we pitched to him the idea, and he said, "well, that sounds cool. Here's half a million, let's make a business out of it.” And so we were so excited, that we went to celebrate, we were like "oh my gosh this is a paper, we can do a business, a startup". So we went to celebrate in Mexico.


"If you want to swim with the sharks, you have to swim faster - you do not do vacations in Mexico."


While in Mexico, we were doing Skype conversations with the VC's firm and negotiating and so on, and at one point they asked us, "where are you?" And we were like, "we're in Mexico". And they freaked out, and I still remember the quote that he said. He said "If you want to swim with the sharks, you have to swim faster - you do not do vacations in Mexico." At that point, it seemed like “maybe I'm not ready to swim with the sharks quite yet,” and it also didn't help that they wanted more than 51 percent of the company. The non-existent business. So we decided not to create a company, learned a bit about VCs, wrote more papers, and I got my PhD. That was exciting.

© Robert Wright/LDV Vision Summit

Story number two, I finished my PhD, I wanted to do my postdoc at University of Washington. I came to Seattle working with Steve Seitz. The first problem that we wanted to solve, I said, "Okay, so I've been working on this single photo idea of reconstruction, but actually every one of us has so many photos out there. And so it's not just one photo, we're going to have bigger and bigger collections, so wouldn't it be amazing just to visualize those big collections somehow?"

What existed at that time was just slideshows, right? And this is just a random showing of photos, not super exciting. So I started playing with big collections, and I found out that if I focus on the person and just align by the location of the eyes, I already get a really cool effect. I kind of see her grow in front of my eyes. And there is something interesting about visualizing the person through photos.

Eventually we thought it was really cool, and we developed the algorithm further. It was taking into account facial expressions and the head pose and so on. And I started showing those results to Steve, and he loved everything. So, then at that time, it was 2010 or 2011, Steve went to spend time with Google, and he said, "hey this looks so cool and practical, how about I'll show it to my boss at Google."

The boss at Google said, "This is interesting, but let's see it on my daughter’s photos." So I did it, I tried it out on his daughter’s photos, and he liked it. Then I went to spend half a year at Google, and with an amazing team we did ship it. The final product was, with the click of a button you could create face movies.


We are accepting applications to our Vision Summit Entrepreneurial Computer Vision Challenge for computer vision research projects and our Startup Competition for visual technology companies with <$2M in funding. Apply now &/or spread the word.


That was exciting, I learned how to make a product, and the product was used by millions. I wrote more papers, I finished my postdoc, and I got a faculty job. I went on the academic market, super competitive, but actually my experience in industry, plus all the papers, helped me get really cool offers. And it helped me get my dream job, where I could stay as faculty at UW, the same place where I did my postdoc.

Story number three, I'm at the University of Washington, but now as a professor. I established my own group, I have students, and we work on all sorts of cool projects and publish papers and so on. But in my free time, kind of as a joke, I'm deeply concerned about a problem: I want to see, will black hair fit me? But I don't want to go and dye my hair before I know if it looks good.

I started, kind of as a toy project, rendering myself with black hair to see how it might look, and then I continued to render myself with curly hair. And I started building this system, and I built it in a way that it looks like an image search engine, where I could type anything, for example, "India", and imagine how I would look. Or I could even go back in time and type "1930" and imagine how I would look in the 1930's. I kept going and could type any query: different hairstyles and colors, shaved and traditional, clothing and so on.

© Robert Wright/LDV Vision Summit

I published a paper about it at SIGGRAPH, it was a single-author paper, and it got in, it got accepted. It looked like a success story, but I kept working on it, and my husband was like, "why are you working on that so hard?" And I said, "I don't know, it just seems cool. I feel like there is a business around it, I want to establish a company." And so he says, "It does seem exciting, so how about we do it together?" And I said, "Yeah, let's do it." So, we created this company, immediately became the CEO and the CTO, and after our kids would go to sleep, we would code.

We bought a ton of equipment to put in our basement and we created a real-time system that lets you do what I just described, and we were ready to let people in to try it. SIGGRAPH came, and I gave a demo during a talk, and a bunch of companies, big companies, got interested. SIGGRAPH is known for its parties, and Michael Cohen from Facebook, Steve Seitz from Google, and I were talking. And then Steve just kind of randomly says, "Hey, did you know that Ira has a company now?" And Michael Cohen's like, "What?! This is interesting," then one thing leads to another and the company's acquired.

So, I'm at Facebook plus UW now. And it's really fun to do; they're ten minutes away from each other in Seattle. And some lessons were kind of interesting for myself and maybe will be useful for you. Research, academic research, means becoming a specialist in a very, very narrow field. That could be considered a bad thing, maybe, because you're stuck in some particular niche. But on the other hand, the way I see it, it's a unique opportunity to know when a technology is right for a product, and you're actually in a unique position to do it before everyone else can. Before everyone else realizes.

Making products that millions use is super fun, but I find it really exciting to just create something that I will use first. Because if everyone else ends up not liking it, then at least one person likes it. The connections you make during school, postdoc, and jobs are the best. Do not forget to go to parties.

Watch Professor Ira Kemelmacher-Shlizerman's keynote at our LDV Vision Summit 2017 below and check out other keynotes on our videos page.

Early Bird tickets are now available for the LDV Vision Summit May 23 & 24, 2018 in NYC to hear from other amazing visual tech researchers, entrepreneurs and investors.

We are accepting applications to our Vision Summit Entrepreneurial Computer Vision Challenge for computer vision research projects and our Startup Competition for visual technology companies with <$2M in funding. Apply now &/or spread the word.

Lyft & Arteris Discuss How Autonomous Vehicles, the Most Disruptive Innovation of a Generation, Will Impact Society

At the LDV Vision Summit 2017, Josh Brustein of Bloomberg Businessweek asked Taggart Matthiesen of Lyft and Charles Janac from Arteris: how will autonomous vehicles, the most disruptive innovation of a generation, impact society?

Ultimately, says Charles, autonomous driving will be one of the most meaningful changes in how we move people and goods in the history of the world. According to Taggart, it won't be just about manufacturing the vehicle, but about creating an integrated experience. Watch their panel discussion to learn more:

Our fifth annual LDV Vision Summit will be May 23 & 24, 2018 in NYC. Early bird tickets are currently on sale. Sign up to our LDV Vision Summit newsletter for updates and deals on tickets.

Glasswing Ventures & GM Ventures Agree, Combining Vision with Additional Functionalities Poses Immense Opportunity

Jessi Hempel of Backchannel sat down with Rudina Seseri of Glasswing Ventures and Rohit Makharia of GM Ventures to discuss trends and investment opportunities in visual technologies at the LDV Vision Summit 2017.

An amalgamation of technologies that work together is most interesting for Rohit at GM. Rudina says that Glasswing is seeing both startups that are trying to retrofit themselves with vision as part of their value proposition and startups, mostly coming out of universities, that are solving real technical problems with vision. They both agree that multimodal functionality on devices - i.e. vision, voice, touch, etc - will open a whole new universe of experiences and products. Watch their panel discussion to learn more:

Our fifth annual LDV Vision Summit will be May 23 & 24, 2018 in NYC. Early bird tickets are currently on sale. Sign up to our LDV Vision Summit newsletter for updates and deals on tickets.

Albert Wenger's Views On Investment Opportunities In Visual Technologies & Data Network Effects

Evan Nisselson had the opportunity to sit down with Albert Wenger, Managing Partner at Union Square Ventures to discuss future investment trends and early stage opportunities at the LDV Vision Summit 2017.

According to Albert, the key to generating above-average investment returns is going where others aren't. Watch their fireside chat to learn more:

Our fifth annual LDV Vision Summit will be May 23 & 24, 2018 in NYC. Early bird tickets are currently on sale. Sign up to our LDV Vision Summit newsletter for updates and deals on tickets.

45 Billion Cameras by 2022 Fuel Business Opportunities

LDV Capital - 5 Year Visual Tech Market Analysis 2017.001.jpeg

Exclusive research by us at LDV Capital is the first publicly shared, in-depth analysis that estimates how many cameras will be in the world in 2022. We believe it is a conservative forecast, as additional sectors will be included in future research.

The entire visual technology ecosystem is driving and driven by the integration of cameras and visual data. Visual technologies are any technologies that capture, analyze, filter, display or distribute visual data for businesses or consumers. They typically leverage computer vision, machine learning and artificial intelligence. 

Over the next five years there will be a proliferation of cameras integrated into products across industries and markets. A paradigm shift will take place in the meaning and use of a camera.

Taking into account the industries that will embed cameras into products, those that will add additional cameras to products, and new vision-enabled products that will arise, the number of cameras will grow at least 220% in the next five years. 

This growth in cameras delivers tremendous insight into business opportunities in the capture, analysis and interpretation of visual data. Cameras are no longer just for memories. They are becoming fundamental to improving business and society. Most of the pictures captured will never be seen by a human eye.

This 19-page report is the first of a multi-phased market analysis of the visual technology ecosystem by LDV Capital. Facts and trends include:

  • Global Camera Forecast
  • Paradigm Shift in Visual Data Capture
  • Depth Capture & New Verticals Driving Growth
  • LDV Market Segments To Watch
  • Visual Technology Ecosystem Growth
  • Processing Advances Enable Leaps in Visual Analysis
  • War Over Artificial Intelligence Will Be Won with Visual Data

Key Findings:

  • Most of the pictures captured will never be seen by a human eye.
  • A paradigm shift will take place in the meaning and use of a camera.
  • Over the next five years there will be a proliferation of cameras integrated into products across industries and markets.
  • Where there is growth in cameras there will be tremendous business opportunities in the capture, analysis and interpretation of visual data.
  • Depth capture will double the number of cameras in handheld cameras.
  • By 2022, the number of cameras will be nearly 12X the 2012 figures.
  • Your smartphone will have between 4 and 10 cameras by 2022.
  • The Internet of Eyes will be larger than the Internet of Things. 
  • In the next five years, robotics will have 20X more integrated cameras.
  • By 2022, all new vehicles will be equipped with more than 25 cameras and this does not include Lidar or Radar.

Download the full report from our Insights page.

We look forward to hearing your insights, learning about your startups and reading your research papers on how businesses are addressing these challenges and opportunities.

Timnit Gebru Wins 2017 ECVC: Leveraging Computer Vision to Predict Race, Education and Income via Google Streetview Images

Timnit Gebru, Winner of the 2017 ECVC © Robert Wright/LDV Vision Summit

Our annual LDV Vision Summit has two competitions. Finalists receive a chance to present their wisdom in front of 600 top industry executives, venture capitalists, and companies recruiting. The winning competitor is also awarded $5,000 in Amazon AWS credits. The competitions:

1. Startup competition for promising visual technology companies with less than $2M in funding

2. Entrepreneurial Computer Vision Challenge (ECVC) for computer vision and machine learning students, professors, experts or enthusiasts working on a unique solution to empower businesses and humanity.

Competitions are open to anyone working in our visual technology sector such as: empowering photography, videography, medical imaging, analytics, robotics, satellite imaging, computer vision, machine learning, artificial intelligence, augmented reality, virtual reality, autonomous cars, media and entertainment, gesture recognition, search, advertising, cameras, e-commerce, visual sensors, sentiment analysis, and much more.

The ECVC provides contestants the opportunity to showcase the technology piece of a potential startup company without requiring a full business plan. It provides a unique opportunity for students, engineers, researchers, professors and/or hackers to test the waters of entrepreneurism in front of a panel of judges including top industry venture capitalists, entrepreneurs, journalists, media executives and companies recruiting.

For the 2017 ECVC we had an outstanding lineup of finalists, including:

  • Timnit Gebru, PhD from Stanford University on “Predicting Demographics Using 50 Million Images”
  • Anurag Sahoo, CTO and Mick Das, CPO of Aitoe Labs
  • Akshay Bhat, PhD Candidate and Charles Herrmann, PhD Candidate from Cornell University on “Deep Video Analytics”
  • Elena Bernardis, PhD of the University of Pennsylvania Children’s Hospital with “Spot It - Quantifying Dermatological Conditions Pixel-by-Pixel”
  • Bo Zhu, PhD of Harvard Medical School’s Martinos Center for Biomedical Imaging presenting “Blink” about synthetic human vision
  • Gabriel Brostow from University College London with “MonoVolumes” a combination of MonoDepth and Volume Completion to understand 3D scene layout

Congratulations to our 2017 LDV Vision Summit Entrepreneurial Computer Vision Challenge Winner: Timnit Gebru  

© Robert Wright/LDV Vision Summit

What was the focus of your winning research project?
We used computer vision algorithms to detect and classify cars in 50 million Google Street View images. We then used the characteristics of these detected cars to predict race, education, income levels, voting patterns and income segregation levels. We were even able to see which city has the highest/lowest per capita CO2 footprint.
 
As a PhD candidate - what were your goals for attending our LDV Vision Summit? Did you attain them?
I mostly wanted to meet other people in the field who might have ideas for future work or collaborations. After the competition, I was contacted by venture capitalists and people whose startups are working on related things. In addition to that, I received some interesting ideas from  conference attendees (e.g. analyzing the frequency of trash collection in neighborhoods to get some signal regarding neighborhood wealth).
 
Why did you apply to our LDV Vision Summit ECVC? Did it meet or beat your expectations and why?
I applied because Serge Belongie (Professor at Cornell Tech and Expert in Residence at LDV Capital) thought it was a good idea. One of his many research interests is similar to my line of work. Since our work has real world applications, I think he felt that presenting it to the LDV community would help us think of ways to make it more accessible. I didn’t know what to expect but it definitely beat my expectations. I have never been at a conference that brings together entrepreneurs who are specifically interested in computer vision. I didn’t know that the vision community was so large, and that many VCs were thinking of companies with a computer vision focus (this is different from thinking of AI in general).
 
Why should other computer vision, machine learning and AI researchers attend next year?
This is unlike any other conference out there because it is the only conference I know of that is focused solely on computer vision but also brings together researchers, investors and entrepreneurs.
 

© Robert Wright/LDV Vision Summit

What was the most valuable part of your LDV Vision Summit experience aside from winning the ECVC?
Meeting others whose work is in a similar space: for example, people who founded companies that are based on analyzing publicly available visual data. One of the judges founded such a company. It helped me think of ways in which my research could be commercialized (if I decided to go that route).
 
Do you have any advice for researchers & PhD candidates that are thinking about evolving their research into a startup business and/or considering submitting their work to the ECVC?
I advise them to think of who exactly their product would benefit and what their API would be like. Even though I was an entrepreneur for about a year, I am still coming from a research background. So I wasn’t thinking about who exactly the customers of my work would be (except for other researchers) until my mentoring sessions with Evan [Nisselson, GP of LDV Capital].
 
What are you looking to do with your research & skills now that you have completed your PhD?
I will be a postdoctoral researcher continuing the same line of work but also studying the societal effects of machine learning and trying to understand how to create fair algorithms. We know that machine learning is being used to make many decisions. For example, who will get high interest rates in a loan, who is more likely to have high crime recidivism rates, etc...The way our current algorithms work, if they are fed with biased datasets, they will output biased conclusions. A recent ProPublica investigation started a debate on the use of machine learning to predict crime recidivism rates. I am very worried about the use of supervised machine learning algorithms in high stakes scenarios.
 

© Robert Wright/LDV Vision Summit

Thank You for Making Our 4th Annual LDV Vision Summit a Success!

Startup Competition Judges Day 2: (in no particular order) Judy Robinett, JRobinett Enterprises, Founder, Author "How to Be a Power Connector", Tracy Chadwell, 1843 Capital, Founding Partner, Vic Singh, General Partner, ENIAC Ventures, Zack Schildhorn, Lux Capital, Partner, Jenny Fielding, Techstars, Managing Director, Emily Becher, Samsung, Managing Director, Clayton Bryan, 500 Shades, 500 Startups Fund, Venture Partner. Dorm Room Fund, Partner, Jessica Peltz-Zatulove, KBS Ventures, Partner, Eric Jensen, Aura Frames, CTO, Claudia Iannazzo, AlphaPrime, Managing Partner, Scott English, Hearst Ventures, Managing Director ©Robert Wright/LDV Vision Summit

Our 2017 Annual LDV Vision Summit was an absolutely amazing event, thanks to all of you brilliant people.

YOU are why our annual LDV Vision Summit gathering is special and a success every year. Thank You!

We are honored that you fly in from around the world each year to share insights, inspire, do deals, recruit, raise capital and help each other succeed!  

Congratulations to our competition winners:
- Startup Competition:  Fantasmo.io, Jameson Detweiler, Co-Founder & CEO
- Entrepreneurial Computer Vision Challenge: Timnit Gebru, Stanford Artificial Intelligence Laboratory, PhD Candidate

"LDV is a really interesting intersection of technologists, researchers, large tech companies, investors and entrepreneurs. There is nothing else like this out there. People are very open to sharing and helping the community advance together." Jameson Detweiler, Fantasmo.io Co-Founder & CEO

"I've never seen a conference like this - you have pure computer vision conferences like CVPR or ICCV or you have GTC-type conferences that are based on one company's resources.  This is an interesting mix of something computer vision and entrepreneurial - it is very unique in that sense, I have never seen anything like it before. It is a lot of fun." Timnit Gebru, PhD Candidate at Stanford Artificial Intelligence Laboratory

Day 2 Fireside Chat: Albert Wenger, Partner at Union Square Ventures and Evan Nisselson, General Partner at LDV Capital ©Robert Wright/LDV Vision Summit

A special thank you to Rebecca Paoletti and Serge Belongie as the summit would not exist without collaborating with them!

“Loved hearing about all the practical applications for computer vision at LDV Vision Summit. Feels like the time has finally come for amazing transformation!" Jenny Fielding, Managing Partner at TechStars

The quotes below from our community are why we created our LDV Vision Summit. We could not have succeeded without the tremendous support from all of our partners and sponsors:

Panel Day 1: Trends and Investment Opportunities in Visual Technologies Moderator: Jessi Hempel, Backchannel, Head of Editorial with Panelists: Rudina Seseri, Glasswing Ventures, Founder & Managing Partner and Rohit Makharia, GM Ventures, Sr. Investment Manager ©Robert Wright/LDV Vision Summit

"The LDV Vision Summit is vibrant, all around me there is so much curiosity and conversation because it is the people who are working on the very edge of these new technologies. These are the conversations that are going to make everything happen and you can just feel that when you're here." Jessi Hempel, Head of Editorial at Backchannel

"My main takeaway is that there are lots of people focused on so many aspects of bringing computer vision to market. This reaffirms my belief that vision is going to play a central role in so many aspects of our lives - from enterprise to retail to autonomous vehicles, etc. The LDV Vision Summit is geeky + fun. It is a collaborative, vibrant environment that brings together a community of likeminded people with very different backgrounds." Rohit Makharia, Senior Investment Manager at GM Ventures

"The LDV Vision Summit is very unique, usually academic conferences are very research focused and business conferences are business orientated. This is a unique combination of the two and, especially in a field like computer vision, with the way that it is growing, it seems very necessary. This is a fantastic place to meet both researchers and business people." Ira Kemelmacher-Shlizerman Research Scientist at Facebook and Assistant Professor at U. Washington (Sold Dreambit to Facebook)

“The energy is amazing, everyone is curious, interested outside of their wheelhouse. Everyone wants to see what is the next big thing and what are the big things that are happening right now.” Matt Rosen, Director, Low-field MRI Lab at MGH/Martinos Center for Biomedical Imaging

“There have been a lot of very exciting discussions around visual technology and autonomous driving. It is interesting to see many different perspectives on it from sensors, from AI, from computer vision, all these different perspectives coming together. It is still a futuristic technology that we want to address and the LDV Vision Summit is great because it gathers top scientists and researchers as well as VCs to discuss how to get to that future.” Jianxiong Xiao, "ProfessorX", Founder & CEO of AutoX

"LDV Vision Summit looks at the cutting edge of all visual technology...you have a lot of brainpower in the room and you can feel the wheels turning as you watch the speakers."  Mia Tramz, Managing Editor, LIFE VR at Time Inc

"Computer vision sits at the heart of the big emerging platforms including autonomous transport, robotics, AR and AI. The LDV Summit provided a great foray into the future of computer vision and more importantly the impact it has on market sectors today through an impressive lineup of speakers, presenters, domain experts and startups." Vic Singh, Founding General Partner, Eniac Ventures

Keynote Day 1: Godmother of VR Delivers Immersive Journalism to Tell Stories That Hopefully Make a Difference and Inspire People To Care, Nonny de la Peña, Godmother of VR, Embelmatic ©Robert Wright/LDV Vision Summit

“The business sector that is going to be most disrupted by computer vision and AI in the short term is transportation, so companies like Uber, taxi companies and the entire car and automotive industry will completely change in the coming years. The coolest thing I learned this morning was from the godmother of VR, how they are looking to change journalism and the way we capture events. The Vision Summit is pretty amazing, I am really impressed by the content, I am really glad I made it.” Clement Farabet, VP of AI Infrastructure at Nvidia (Sold MADBITS to Twitter)

“We are seeing visual technologies, especially combined with AI and machine learning, disrupt a broad array of existing markets and create new ones. From the role they are playing in autonomous vehicles, to transforming marketing technologies, to the roles they are playing in physical and cyber security - and of course the role they are playing around consumer electronics and robotics. It is comforting to know everyone is just as excited as I am about computer vision and AI, and to see how big the opportunity is and how early in the cycle we are as well.” Rudina Seseri, Founder & Managing Partner of Glasswing Ventures

"My second time attending the LDV Vision Summit was even better than the first.  A great mix of accomplished technical people and energetic young entrepreneurs." Dave Touretzky, Research Professor, Computer Science at Carnegie Mellon University

"It was fascinating to see a broad range of new visual technologies. I left the Summit full of ideas for new applications." Tom Bender, Co-Founder of Dreams Media, Inc.

“The LDV Vision Summit gave me the opportunity to discover new applications of computer vision and meet leaders at the forefront of really interesting innovations and startups.” Elodie Mailliet Storm, JSK Fellow in Media Innovation at Stanford.

 "This cross-pollination of all different sectors is quite unique - especially coming from an academic setting. To interact with all of these different folks from industry, research and sciences, and from media really inspires me to think about all sorts of new ideas." Bo Zhu, Postdoctoral Research Fellow at MGH/Martinos Center for Biomedical Imaging

"It was enlightening and fascinating to see the potential of the tech that's driving a visual communications revolution." Scott Lewis Photography

Panel Day 2: What’s On Now? Moderator: Rebecca Paoletti, Cake Works, CEO with Panelists: Brian Rifkin, JW Player, Co-Founder, SVP Strategic Partnerships, Michael Downing, Tout, Founder & CEO, Orlando Lima, Viacom/VH1, VP Digital ©Robert Wright/LDV Vision Summit

"It is a great opportunity to meet diverse people from all different industries, a good opportunity to network with interesting talks." James Philbin, Senior Director of Computer Vision at Zoox

"If you work in visual tech, you simply can't afford to miss the LDV Summit – it's a two-day power punch of engaging talks and wicked smart attendees." Rosanna Myers, Co-Founder & CEO of Carbon Robotics

"The LDV Vision Summit is somewhere in between an academic workshop and a venture capital roundtable - it is the kind of event that didn't exist before. You have academics, researchers, grad students, professors but you also have investors, VC and angels like you've never had before. It is very high energy, the atmosphere here is fun to see the two worlds come together. From the academic side, there are grad students and other researchers who have been inside a safe bubble for a long time. They are starting to hear that visual tech are really promising and they are curious about what is going on in the entrepreneurial world and the big companies out there. This is an event where there is enough familiar content for them to feel at home but enough new content, new people, contacts and so on to go outside of their comfort zone." Serge Belongie, Professor of Computer Vision at Cornell Tech

"The LDV Summit is two curated days of outside the box ideas with the key players from diverse industries that are collectively creating the future." Brian Storm, Founder & Executive Producer at MediaStorm
 

ECVC judges Day 1 (L to R) - Aaron Hertzmann, Adobe, Principal Scientist, Ira Kemelmacher-Shlizerman, Facebook, Research Scientist, U. Washington, Assist. Professor, Andrew Zhai, Pinterest, Software Engineer, Tali Dekel, Google, Research Scientist, Yale Song, Yahoo, Senior Research Scientist, Jan Erik Solem, Mapillary, CEO & Co-founder (not pictured: Vance Bjorn, CertifID, CEO & Co-Founder, Rudina Seseri, Glasswing Ventures, Founder & Managing Partner, James Philbin, Zoox, Senior Director, Computer Vision, Josh Kopelman, First Round Capital, Managing Partner, Clement Farabet, Nvidia, VP AI Infrastructure, Adrien Treuille, Carnegie Mellon University, Assistant Professor, Serge Belongie, Cornell Tech, Professor, Manohar Paluri, Facebook, Manager, Computer Vision Group, Rohit Makharia, GM Ventures, Sr. Investment Manager) ©Robert Wright/LDV Vision Summit

"Evan sets the tone with a lot of energy, it is pretty amazing. I am typically around a lot of engineers and it is always great to get Evan up there with his big energy - he asks you honest questions. I also spend a lot of time in the hallway because you get to meet people from other years and keep up those relationships. This is an awesome opportunity to meet the whole mix, from employers, to startup people and investors." Oscar Beijbom, Machine Learning Lead at nuTonomy

"The LDV Summit is the perfect combination of a window into the future of some of the most interesting technologies and a welcoming place to make new connections. " Tracy Chadwell, Founding Partner of 1843 Capital

"The community that is assembled here, isn't anywhere else. There's not a place where all the operators in the computer vision space are in the same place at the same time. Everybody here is capturing the electricity of whats going on inside computer vision right now and being surrounded by everybody who cares about it like you do, is really invigorating. I was just having beers with the head of Uber ATG and he's making self-driving cars, I'm never going to, but he had an optimization method that is absolutely applicable to a thing I am working on, fighting human trafficking. The cross-disciplinary nature of this group creates a lot of opportunities to learn about techniques that are absolutely applicable to your problem domain that you would never see anywhere else. If you are into computer vision this is a place you need to be every year." Rob Spectre, Brooklyn Hacker. Former VP Developer Network at Twilio

"The summit far surpassed my expectations. The bringing together of entrepreneurs, researchers, executives, and investors provided for an exchange of ideas not usually possible in other forums. I definitely recommend the summit for anyone tangentially associated with computer vision and visual technologies!" Joshua David Cotton

©Dean Meyers/Vizworld

Fireside Chat Day 1: Josh Kopelman, Managing Partner of First Round Capital and Evan Nisselson, General Partner of LDV Capital ©Robert Wright/LDV Vision Summit

Keynote Day 1: How and Why Did University of Washington Professor Ira Kemelmacher-Shlizerman Build Dreambit and Sell To Facebook, Ira Kemelmacher-Shlizerman, Facebook, Research Scientist, University of Washington, Assist. Professor ©Robert Wright/LDV Vision Summit

Learn more about our partners and sponsors:

Organizers:
Presented by Evan Nisselson, LDV Capital
Video Program: Rebecca Paoletti, CakeWorks, CEO
Computer Vision Program: Serge Belongie, Cornell Tech
Computer Vision Advisors: Jan Erik Solem, Mapillary; Samson Timoner, Cyclops; Luc Vincent, Lyft; Gaile Gordon, Enlighted; Alexandre Winter, Netgear; Avi Muchnick, Adobe
Universities: Cornell Tech, School of Visual Arts, International Center of Photography
Sponsors: Amazon AWS, Facebook, GumGum, JWPlayer
Media Partners: Kaptur, VizWorld, The Exponential View
Coordinators Entrepreneurial Computer Vision Challenge: Hani Altwaijry, Cornell University, Doctor of Philosophy in Computer Science, Shaojun Zhu, Rutgers University, Doctor of Philosophy Candidate in Computer Science, and Abhinav Shrivastava, Carnegie Mellon University, Doctor of Philosophy in Robotics (Vision & Perception)

AWS Activate: Amazon Web Services provides startups with the low-cost, easy-to-use infrastructure needed to scale and grow any size business. Some of the world's hottest startups, including Pinterest, Instagram, and Dropbox, have leveraged the power of AWS to easily get started and quickly scale.

CakeWorks is a boutique digital video agency that launches and accelerates high-growth media businesses. Stay in the know with our weekly video insider newsletter. #videoiscake

Cornell Tech is a revolutionary model for graduate education that fuses technology with business and creative thinking. Cornell Tech brings together like-minded faculty, business leaders, tech entrepreneurs and students in a catalytic environment to produce visionary ideas grounded in significant needs that will reinvent the way we live.

Panel Day 2: Trends and Investment Opportunities in Visual Technologies. Moderator: Erin Griffith, Fortune, Senior Writer with Panelists: Vic Singh, General Partner, ENIAC Ventures, Claudia Iannazzo, AlphaPrime Ventures, Managing Partner & Co-Founder, Scott English, Hearst Ventures, Managing Director, Emily Becher, Managing Director, Head of Samsung Next Start ©Robert Wright/LDV Vision Summit

Facebook’s mission is to give people the power to share and make the world more open and connected. Achieving this requires constant innovation. Computer vision researchers at Facebook invent new ways for computers to gain a higher level of understanding cued from the visual world around us - from creating visual sensors derived from digital images and videos that extract information about our environment, to further enabling Facebook services to automate visual tasks. We seek to create magical experiences for the people who use our products.

JW Player is the world’s largest network-independent video platform.  The company’s flagship product, JW Player, is live on more than 2 million sites with over 1.3 billion monthly unique viewers across all devices — OTT, mobile and desktop.  In addition to the player, the company’s services include advertising, analytics, data services, video hosting and streaming.

GumGum is a leading computer vision company with a mission to unlock the value of every online image for marketers. Its patented image-recognition technology delivers highly visible advertising campaigns to more than 400 million users as they view pictures and content across more than 2,000 premium publishers.

The International Center of Photography is the world’s leading institution dedicated to the practice and understanding of photography and the reproduced image in all its forms. Since its founding in 1974, ICP has presented more than 700 exhibitions and offered thousands of classes, providing instruction at every level.

Day 2 Keynote: 100 Million Pictures of Human Cells and Computer Vision Will Accelerate the Search for Disease Treatments. Blake Borgeson, Recursion Pharmaceuticals, CTO & Co-Founder ©Robert Wright/LDV Vision Summit

Kaptur is the first magazine about the photo tech space. News, research and stats along with commentaries, industry reports and deep analysis written by industry experts.

LDV Capital invests in people around the world who are creating visual technology businesses with deep domain expertise.

Mapillary is a community-based photomapping service that covers more than just streets, providing real-time data for cities and governments at scale. With hundreds of thousands of new photos every day, Mapillary can connect images to create an immersive ground-level view of the world for users to virtually explore and to document change over time.

The MFA Photography, Video and Related Media Department at the School of Visual Arts is the premier program for the study of Lens and Screen Arts. This program champions multimedia integration and interdisciplinary activity, and provides ever-expanding opportunities for lens-based students.

VizWorld.com covers news and the community engaged in applied visual thinking, from innovation and design theory to technology, media and education. VizWorld is also a contributing member of the Virtual Reality/Augmented Reality Association. From the whiteboard to the latest OLED screens and HMDs, graphic recording to movie making and VR/AR/MR, VizWorld readers want to know how to put visual thinking to work and play. SHOW US your story!

AliKat Productions is a New York-based event management and marketing company: a one-stop shop for all event, marketing and promotional needs. We plan and execute high-profile, stylized, local, national and international events, specializing in unique, targeted solutions that are highly successful and sustainable. #AliKatProd

Robert Wright Photography clients include Bloomberg Markets, Budget Travel, Elle, Details, Entrepreneur, ESPN The Magazine, Fast Company, Fortune, Glamour, Inc. Men's Journal, Newsweek (the old one), Outside, People, New York Magazine, New York Times, Self, Stern, T&L, Time, W, Wall Street Journal, Happy Cyclist and more…

Prime Image Media works with clients large and small to produce high quality, professional video production. From underwater video to aerial drone shoots, and from one-minute web videos to full blown television pilots... if you want it produced, they can do it.

We are a family affair! Serge, August, Kirstine and Emilia Belongie along with Evan Nisselson celebrating Timnit Gebru's win in the 2017 Entrepreneurial Computer Vision Challenge. See you next year! #carpediem ©Robert Wright/LDV Vision Summit

Computer Vision Delivers Contextual And Emotionally Relevant Brand Messages

The power of object recognition and the transformative effect of deep learning to analyze scenes and parse content can have a lot of impact in advertising. At the 2016 Annual LDV Vision Summit, Ken Weiner CTO at GumGum told us about the impact of image recognition and computer vision in online advertising.

The 2017 Annual Vision Summit is this week, May 24 & 25, in NYC. Come see new speakers discuss the intersection of business and visual tech.

I’m going to talk a little bit about advertising and computer vision and how they go together for us at GumGum. Digital images are basically showing up everywhere you look. You see them when you're reading editorial content. You see them when you're looking at your social feeds. They just can't be avoided these days. GumGum has basically built a platform with computer vision engineers that tries to identify a lot of information about the images that we come across online. We try to do object detection. We look for logos. We detect brand safety, sentiment analysis, all those types of things. We basically want to learn as much as we can about digital photos and images for the benefit of advertisers and marketers.

The question is: what value do marketers get from having this information? Well, for one thing, if you're a brand, you really want to know how users out there are engaging with your brand. We look at the fire hose of social feeds and look, for example, for brand logos. In this example, Monster Energy drink wants to find all the images out there where their drink appears in the photo. You have to remember that about 80% of those photos might have no textual information identifying that Monster is involved in the photo, but the brand is there. You really need computer vision in order to understand that.

Why do they do that? They want to look at how people engage with them. They want to look at how people are engaging with their competitors. They may want to just understand what is changing over time. What are maybe some associations with their brand that they didn't know about that might come up. For example, what if they start finding out that Monster Energy drinks are appearing in all these mountain biking photos or something? That might give them a clue that they should go out and sponsor a cycling competition. The other thing they can find out with this is who are their main brand ambassadors and influencers out there. Tools like this give them a chance to connect with those people.
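GumGum's production system is proprietary, but as a rough sketch of the kind of logo spotting described above, an off-the-shelf detector fine-tuned on a brand's logo could be scanned over a feed of social images. Everything below is an assumption for illustration: the checkpoint file `monster_logo_detector.pt` and the single "logo" class are hypothetical placeholders, not GumGum's actual models.

```python
# Minimal sketch (not GumGum's system): scan social images for a brand logo
# with a detector fine-tuned on logo examples. The checkpoint name and the
# single logo class are hypothetical placeholders.
import torch
from PIL import Image
from torchvision import transforms
from torchvision.models.detection import fasterrcnn_resnet50_fpn

NUM_CLASSES = 2  # background + "logo" (assumed fine-tuned head)

model = fasterrcnn_resnet50_fpn(num_classes=NUM_CLASSES)
model.load_state_dict(torch.load("monster_logo_detector.pt", map_location="cpu"))
model.eval()

to_tensor = transforms.ToTensor()

def find_logo(paths, score_threshold=0.8):
    """Return image paths whose detections include the logo class."""
    hits = []
    with torch.no_grad():
        for path in paths:
            img = to_tensor(Image.open(path).convert("RGB"))
            output = model([img])[0]  # dict with boxes, labels, scores
            keep = output["scores"] >= score_threshold
            if (output["labels"][keep] == 1).any():  # label 1 = logo class
                hits.append(path)
    return hits

print(find_logo(["feed_0001.jpg", "feed_0002.jpg"]))
```

A real system would of course batch images, run on GPU and handle many brands at once; the point is only that logo spotting reduces to detection over a stream of images.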


What makes [in-image] even more powerful is if you can connect the brand message with that image in a very contextual way and tap into the emotion that somebody’s experiencing when they’re looking at a photo.

-Ken Weiner


Another product that’s been very successful for us is something we call in-image advertising. We came up with this kind of unit about eight years ago. It was really invented to combat what people call banner blindness, which is the notion that, out on a web page, you start to learn to ignore the ads that are showing at the top and the side of the page. If you were to place brand messages right in line with content that people are actively engaged with, you have a much better chance of reaching the consumer. What makes it even more powerful is if you can connect the brand message with that image in a very contextual way and tap into the emotion that somebody’s experiencing when they’re looking at a photo. Just the placement alone for an ad like this receives 10x the performance of traditional advertising because it’s something that a user pays attention to.

Obviously, we can build a big database of information about images and contextually place ads like this, but sometimes requests will come from advertisers that can't draw upon our existing knowledge, and we'll have to go out and develop custom technology for them. For example, L’Oréal wanted to advertise a product for hair coloring. They asked us if we could look at every image across different websites and identify the hair color of the people in the images, so that they could strategically target the products that go along with those hair colors. We ran this campaign for them, and they were really, really happy with it.

They liked it so much that they came back to us and said, “We had such a good experience with that. Now we want you to go out and find people that have bold lips,” which was a rather strange notion for us. Our computer vision engineers came up with a way to segment the lips and figure out what boldness means. L’Oréal was very happy, and they ran a lipstick campaign on these types of images.
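Neither the L'Oréal campaign logic nor GumGum's implementation is public, but a toy sketch of hair-color targeting might detect a face, treat the band just above the face box as a hair proxy, and cluster its pixels for a dominant color. The heuristic, thresholds and file name below are all assumptions for demonstration, not a production approach.

```python
# Illustrative sketch only: rough hair-color estimate for ad targeting.
# The "hair = band above the face box" heuristic and all parameters are
# assumptions for demonstration purposes.
import cv2
import numpy as np

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def dominant_hair_color(image_path, k=3):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    # Crude proxy: a band above and around the top of the face box.
    top = max(0, y - h // 2)
    hair = img[top:y + h // 4, x:x + w].reshape(-1, 3).astype(np.float32)
    # k-means over BGR pixels; the largest cluster is the dominant color.
    _, labels, centers = cv2.kmeans(
        hair, k, None,
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0),
        3, cv2.KMEANS_RANDOM_CENTERS,
    )
    counts = np.bincount(labels.flatten())
    return centers[counts.argmax()]  # BGR triple, e.g. mapped to blonde/brown/black

print(dominant_hair_color("editorial_photo.jpg"))
```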

A couple of years ago, we had a very interesting in-image campaign that I think might be the first time the actual content you're viewing became part of the advertising creative. Lifetime TV wanted to advertise the TV series Witches of East End. We looked for photos where people were facing forward, and when we encountered those photos we dynamically overlaid green witch eyes onto the people in them. It gives people the notion that they become a little witchy for a few seconds. Then that collapses and becomes a traditional in-image ad, where somebody who has been intrigued by the eyes can go ahead and click to watch a video lightbox with the preview for the show.

I just thought this was one of the most interesting ad campaigns I’ve ever seen because it mixes the notion of content and creative into one. What’s coming after this? Naturally, this will extend into video. TV networks are already training you to look at information in the lower third of the screen. It’s only natural that this will get replaced by contextual advertising the same way we’ve done it for images online.

Another thing that I think is coming soon is the ability to really annotate specific products and items inside images at scale. People have tried to do this using crowdsourcing in the past, but it’s just too expensive. When you're looking at millions of images a day like we do, you really need information to come in a more automated way. There’s been a lot of talk about AR. Obviously, advertising’s going to have to fit into this in some way or another. It may be a local direct response advertiser. You're walking down the street. Someone gives you a coupon for McDonald’s. Maybe it’ll be a brand advertiser. You see a car accident, and they’re going to remind you that you need to get car insurance.

Lastly, I wanted to pose the idea of in-hologram ads that I think could come in the future if these things like Siri and Alexa … Now they’re voice, but in the future, who knows? They might be 3D images living in your living room, and advertisers are going to want a way to basically put their name on those holograms. Thank you very much.

Get your tickets now to the next Annual LDV Vision Summit.

Get Ready to See More 3D Selfies in Your Facebook Feed

Alban Denoyel, CEO and Co-Founder of Sketchfab spoke at the 3rd Annual LDV Vision Summit in 2016 about 3D content and 3D Ecosystems and its impact on virtual reality.

At the 2017 Annual Vision Summit this week, we will be expanding upon the conversation with new speakers in the AR, VR and content creation spaces. Check out the agenda for more.

 

As Co-Founder and CEO of Sketchfab, I'm going to talk about user-generated content in a volumetric era. The VR headsets are all hitting the market today, tomorrow it's going to be the AR headsets, and we're starting to see holographic devices. And the big question is, of course, the content. What content are we going to consume with all this hardware?

If you look at VR content today, I put it in two brackets. One is studio-generated content, like the Henry movie by Oculus. It's really great. There are two issues with that: one is that it takes time to make, and the other is that it takes money. The result is that there is very little studio-made VR content. If you go to the Oculus store today, you'll see that for yourself.

The other bracket of content is user-generated content, and it has to be the bulk of VR content - it has to be user generated. Today, user-generated content for VR is mostly 360 video.

We live in a 3D world, as you all know, and we have six degrees of freedom. I can walk in a space in real life, and VR is able to recreate the same thing - this is what we need to get a real sense of presence. The advanced VR headsets have positional tracking, which lets you walk inside a space in full freedom. So which content is going to be able to serve this ultimate VR promise?

The good news is that we're entering an era of 3D creation for all, thanks to two trends. One is much easier tools to create 3D content. I think the most iconic example of that is Minecraft. Maybe you don't think of it as a 3D creation tool, but there are hundreds of 3D creations coming from Minecraft on Sketchfab. Just by assembling small cubes you are able to build entire worlds, and then you can navigate them in VR.

Another great example is Tilt Brush, which lets you make VR content in VR. I don't know if you have tried it, but it's really fascinating. You create in VR and then you're able to revisit it in VR.

© Robert Wright/LDV Vision Summit

The second mega trend for 3D creation is 3D capture, and it is really fascinating to see how it has evolved over the past five years. The most famous project is maybe Project Tango by Google; they are shipping their first phone with a 3D sensor this summer with Lenovo. And if you look at what's happening on the Apple side, they bought PrimeSense three or four years ago. PrimeSense was the company behind the Kinect, and all of this points to a future iPhone with a 3D camera. The day we have an iPhone with a 3D camera, you'll be able to capture spaces and people in 3D. If you look at how we've captured the world, we started with drawing, then we started taking pictures, and then we started taking videos. But since we live in a 3D world, 3D capture is going to be the next way we capture things.

And so, here is an example with my son, William. I make a 3D portrait of him every month, and I took this with just a phone. It's hard to show a 3D file on a 2D screen, and it's not dancing yet, but I also have dancing versions of him.

3D capture is super important, but being able to distribute this content is equally important. When it comes to user-generated content, you have to share it online and help it travel across the web. That's what we do at Sketchfab: we're a platform to host and distribute 3D files. With technologies like WebGL and WebVR we are able to browse this content in VR straight from a browser. A pretty good example of that is that we are natively supported in Facebook, which means I can share this 3D portrait of my son, William, in a Facebook post and then prompt a VR view straight from my Facebook feed, just from the browser, without having to go to an app store and install a complicated setup.

One area where user-generated 3D content is really booming is cultural heritage. A lot of museums are starting to digitize their collections in 3D. But also a lot of ordinary people, when they go to museums, are starting to take pictures of statues from various angles and then publish them on the web. There are very interesting initiatives that started about two years ago and are still happening around what happened in Syria: when ISIS started destroying art and museums, a lot of people on the internet started crowdsourcing the 3D reconstruction of those places. Here's an example of a temple in Palmyra that is preserved forever in a digital format.

Another very interesting vertical to me is documenting world events. With this technology we're able to see 3D data from an event pretty much the day it happens, which gives a new perspective to an event that is super interesting. On the left, you can see Kathmandu just after the terrible earthquake that happened last summer. The day it happened, a guy flew a drone over Kathmandu, generated a 3D map from it, and published it on Sketchfab. You were able, the same day, to walk through the devastated Kathmandu in VR just from the web. That was pretty fascinating. On the right, something super different, is the memorial that formed the day of Prince's death. People started putting flowers and guitars in front of a concert venue, and a guy just made a 3D capture of it - a great way to document this place and this event.

3D capture applies to all areas of content, and we are starting to see the same trends we saw on Instagram: people shooting their things, their food, their faces. So I think you can get ready to see more and more 3D selfies in your Facebook news feed.

Don't miss our 4th Annual LDV Vision Summit May 24 & 25 at the SVA Theatre in NYC.

Image Recognition Will Empower Autonomous Decision Making

Rudina Seseri, Founder & Managing Partner of Glasswing Ventures

Rudina Seseri is the Founder & Managing Partner of Glasswing Ventures. With over 14 years of investing and transactional experience, she has led technology investments and acquisitions in startup companies in the fields of robotics, Internet of Things (IoT), SaaS marketing technologies and digital media.

Rudina will be sharing her knowledge on trends and investment opportunities in visual technologies as a panelist and startup competition judge at the 2017 Annual LDV Vision Summit. We asked her some questions this week about her experience investing in visual tech and what she is looking forward to at the Vision Summit...

You are investing in Artificial Intelligence (AI) businesses which analyze various types of visual data. In your perspective, what are the most important types of visual data for artificial intelligence to succeed and why?
Nowadays, a key constraint for AI to succeed in perception tasks is good (i.e. labeled) datasets. Deep learning has allowed us to achieve "super-human" performance in some tasks, and computer vision is a key pioneering area - from LeCun's OCR in the 90s, to the new wave of AI excitement spurred by Andrew Ng – and others – in the unsupervised tagging of YouTube videos and deep nets' performance in the ILSVRC competition (an annual image recognition competition which uses a massive database of labeled images).

Image recognition has now moved from single object labeling to segment labeling and full scene transcription. Video has also seen impressive results. An important next step will be to see how we can move from perception tasks like image recognition to autonomous decision making; the results already achieved in games and self-driving cars are promising. One can think of applications in just about anything: autonomous vehicles, visual search, (visual) business intelligence, social media, visual diagnostics, entertainment, etc. However, I think the most important thing for success is to be able to match the type of data and algorithm to whichever problem you're trying to solve. The ability to create valuable datasets in new use cases will be essential for startups.
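To make the jump from single object labeling to segment labeling concrete, here is a minimal sketch (my own, not tied to Glasswing or any company mentioned here) that runs an off-the-shelf semantic segmentation model from torchvision and reports which classes appear in an image and what fraction of pixels each covers. The input file name is a placeholder, and the pretrained-weights flag may need adjusting for newer torchvision releases.

```python
# Minimal sketch: segment labeling with an off-the-shelf model, to contrast
# with single-label classification. Uses torchvision's pretrained DeepLabV3;
# on newer torchvision use weights="DEFAULT" instead of pretrained=True.
import torch
from PIL import Image
from torchvision import models, transforms

model = models.segmentation.deeplabv3_resnet50(pretrained=True).eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

img = Image.open("street_scene.jpg").convert("RGB")
batch = preprocess(img).unsqueeze(0)

with torch.no_grad():
    logits = model(batch)["out"][0]   # (num_classes, H, W)
mask = logits.argmax(dim=0)           # per-pixel class index

# Report which Pascal VOC classes are present and their pixel share.
for cls in mask.unique():
    share = (mask == cls).float().mean().item()
    print(f"class {int(cls)}: {share:.1%} of pixels")
```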


I believe AI and vision will have a massive impact across sectors and industries which is why we decided to launch [Glasswing Ventures].

-Rudina Seseri


What business sector do you believe will be most disrupted by computer vision and AI?
That’s a tough one, because I believe AI and vision will have a massive impact across sectors and industries, which is why we decided to launch the firm. From a vision point of view, we need to ask which business sectors rely (or could rely) the most on images; those are likely to be the ones "most disrupted" by AI. Within the enterprise, marketing and retail are likely to be among the earliest adopters. In terms of sectors, it's easy to see the impact that AI will have on e-commerce, transportation, healthcare diagnostics, security, etc.
 
You are speaking and judging at our LDV Vision Summit. What are you most excited about?
The LDV Vision Summit is a key event for anyone involved in computer vision. As a speaker and a judge, I get to share the stage with some of the pioneers in the domain and hear the pitches of some of the most promising entrepreneurs in the area. Being able to spend two days with all of you and discuss trends and the future of computer vision is invaluable.

You’ve said “the skillset of data scientists will be rendered useless in 12-18 months. They will need to either evolve with new AI tools or become a new category of Machine Language Scientists.” How does this rapid evolution in AI impact your investing strategy?
Data science is indeed evolving at a very fast pace. The exponential improvement in computing power, the ability of GPUs to parallelize data processing (crucial for CNNs), and the sheer abundance of data available have required data scientists to rethink how they can better leverage these capabilities and experiment with what was previously unthinkable. While most of the algorithms considered state-of-the-art today were developed over decades, the way in which data scientists use them has changed considerably - i.e. moving from feature engineering to architecture engineering.

Additionally, the community has fully embraced open-source, with most breakthroughs being published and algorithms shared. This means that savvy data scientists have to: know the advantages and limitations of each approach for their use case given the new computing/data constraints; be willing to experiment with new methods and embrace open-source while being able to build a sustainable competitive advantage; and be on top of the new developments in their area.

Finally, the emergence of data science at the center of AI development has created a new, major stakeholder in product teams (along with engineering and PM). A good dynamic between these three teams is key: constant collaboration to push the limits of the technology, while always focusing on creating a product that delivers superior value versus the status quo to the target customer.
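As a toy illustration of the shift from feature engineering to architecture engineering (my example, not Rudina's), the sketch below composes standard convolutional layers in PyTorch and lets training learn the features, rather than hand-designing the features themselves.

```python
# Toy illustration of architecture engineering: features are learned by
# stacked convolutional layers instead of being hand-crafted.
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Sanity check on a random batch of 32x32 RGB images.
model = SmallConvNet()
print(model(torch.randn(4, 3, 32, 32)).shape)  # torch.Size([4, 10])
```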

This is the last week to get your discount tickets for the 4th Annual LDV Vision Summit which is featuring fantastic speakers like Rudina. Register now before ticket prices go up!

The Perfect Storm For Computer Vision To Reform Life Sciences

Judy Robinett, Founder & President of JRobinett Enterprises

Judy Robinett is known for her titanium rolodex. She speaks and consults with professionals, entrepreneurs, and businesses on the topics of strategic networking, relationship capital, startup funding strategy, strategic alliances, and leadership. Her book, "How to be a Power Connector" was selected as the #1 Business Book for 2014 by Inc. Magazine. In her more than 30 years of experience as an entrepreneur and corporate leader, Judy has served as the CEO of public and private companies, in management positions at Fortune 500 companies, and on the advisory boards of top tier venture firms. 

We are excited to have Judy as a judge at our 2017 LDV Vision Summit startup competition. Evan caught up with her in this last week before the Vision Summit about her experience investing in startups and the things she is looking forward to at the Vision Summit...

You see many startup pitches - what are the three most important reasons you would choose to invest in a team?
Character rules. While at first blush there are many solid deals with purported upsides, my first focus is on character. Howard Stevenson, who is known as the lion of entrepreneurship at Harvard, once told me that the first time someone was untruthful, he walked, knowing he would lose all his money regardless of how promising the deal was. Howard has invested in well over 90 startups, and his book, “Winning Angels: The 7 Fundamentals of Early Stage Investing,” is a must-read.

Besides dishonesty, I try to avoid bad actors. Life is too long to deal with the dark triad: narcissists, Machiavellians and sociopaths. I want to make sure these are folks I like, trust and want to work with for the long haul.

Top questions I ask myself: does the team exhibit emotional IQ? How do they deal with conflict? Litigation is emotionally and financially draining, so it is better to determine ahead of time what happens if there is a blowup.

Second, does the team show the ability to learn and pivot? Are they coachable? How do they deal with feedback? It’s a rare business model that doesn’t morph.

Finally, vision is grand, but do they have a customer? Paul Graham of Y Combinator said there are only two reasons startups fail: the first is lack of a customer and the second is lack of funding. A few months ago a founder called me after spending $12M to build out a platform, only to discover no one wanted it.

Judy Robinett, LDV Community dinner May 2016 © Robert Wright

What business sector do you believe will be most disrupted by computer vision?
I’d bet on life sciences. Over 25 years ago scientists tried to use AI to diagnose cancer but failed. With Intel’s new 10 nm chip out later this year, we will find ourselves in a perfect storm: processing power, adoption across sectors globally, more investment, and a projected 150M users of consumer AI. We will be able to solve problems we couldn’t before. Think of IBM’s Watson, which has now read every research paper available on cancer and tracks all clinical trials, and then imagine what will happen when pictures are added.
 
Future surgeons will be able to perform simulated operations.  We will see better diagnostics, improved quality care and lower prices.
 
At the White House Fintech summit, many in the audience were startled to see the adoption of robo-advisors like Betterment for financial advice. The same arguments were made a few years ago - that this industry needed 'high touch and high trust' and that new technology wouldn't work.

What was your biggest mistake as an investor and what did you learn from it?
I got a $100K brick to the head when I fell in love with the product and drank the Kool-Aid from a charismatic founder with a big idea. My big takeaway was that I needed to step back, do the math, think hard about the assumptions and then do a gut check; then, and only then, would I invest.
 
Now I have Jack Welch’s quote, “Get Better Reality,” posted on my wall.

We are honored to have you judge the 2017 LDV Vision Summit startup competition. What are you most excited to see at our Summit?
I would love to see folks figuring out motion sickness, cancer diagnosis, next-gen headgear and MRIs, expanded content but honestly, I want to be surprised!

Accenture's global technology trends report now lists AI as the number one trend, saying "AI will be the new UI."

I’ve been watching new players, from Russia's Ntech Labs with facial recognition to Boston's emotion AI company Affectiva, which now has over 5 million facial videos. I’ve seen the narrative change from people dissing this technology to people proclaiming it will be the next industrial revolution.

At the Vision Summit, I'm interested in seeing how speakers from companies like Upskill, PathAI, OrCam and AutoX showcase how they are utilizing AI and computer vision to revolutionize their industries as well.

You wrote a great book called “How to Be a Power Connector.” What was your goal in writing this book? Do you have one exciting success story after writing this book that you can share?
When I was CEO of a small public biotech I frequently spoke at BIO. I grew frustrated that promising drugs were falling by the wayside because people couldn’t connect the dots to the critical resources needed for success. With 7.4 billion people on earth, $369 trillion in global private wealth projected by 2019 by Credit Suisse, and countless ideas, with information said to double daily with IoT, there is no lack of resources. But most people are in the wrong room talking to the wrong people, asking that Old Testament Job question: why me?
 
I wanted to change that by providing a clear, easy-to-follow roadmap with the latest research on strategy and relationships.  Nothing happens without people.
 
Three months after McGraw-Hill published my book, I heard from a young man in Africa who had applied my principles and obtained funding.  I did the happy dance.

Last chance to get your discount tickets to the 4th Annual Vision Summit, now through May 19!

Nicolas Pinto Predicted the Deep Learning Tsunami but Not the Velocity - Selling to Apple Was His Answer

Nicolas Pinto, Deep Learning Lead at Apple © Robert Wright/LDV Vision Summit

Nicolas Pinto is focused on mobile deep learning at Apple. At the 2016 LDV Vision Summit he spoke about his journey as a neuroscientist and artificial intelligence researcher, turned entrepreneur who sold his company to Apple.

Discount tickets are available until May 19th for the 2017 Annual LDV Vision Summit to hear from other amazing researchers & entrepreneurs about their route to selling a successful business.

Good morning everyone. My name is Nico and I have ten minutes to talk to you about the past ten years, when I went from being a neuroscientist at MIT and Harvard to creating a stealth deep learning startup in Silicon Valley and finally selling it to Apple. Let's go back to 2006, about ten years ago. That's when I started to work with neuroscientist Jim DiCarlo at MIT and David Cox at Harvard. They had a very interesting approach: they both wanted to do reverse and forward engineering of the brain in the same lab. Usually, these things would be done in different labs, but they wanted to do both in the same lab, and I thought the approach was very interesting. They really wanted to study natural systems, real brains, the system that works, and also build artificial systems at scale - approaching the scale of natural systems, so really big scale.

What do we see when we study natural systems? Let me go very quickly here. The first thing we see when we study the visual cortex - we are focusing on vision, obviously - is that it's basically a deep neural network, composed of many layers that share similar properties: similar, repeated patterns. A lot of these properties have been described in the literature; many, many studies since the sixties have described these things in the physiology literature, so you can look at them. The other thing we saw is that, if you look at the modeling literature, there are many, many different ideas about how these things could be working, starting in the sixties.

I think this will work really well.

Many, many different studies, many different models, many different ideas and parameters, ultimately culminating in convolutional neural networks - that's probably what you've heard of. These convolutional networks have been popularized by Yann LeCun in the machine learning community but also by Tommy Poggio in the computational neuroscience community with the HMAX model. All these models can work kind of the same yet look very different; you don't really know. What's very interesting about them is that they have very specific details - some of them have to do with learning, some of them have to do with architecture. It's really hard to make sense of all that. On the learning side, there are many ways you can do learning - many, many different ideas about how you can do learning, starting in the sixties and moving on from computational neuroscience into machine learning. I'm not going to go into it, but with so many different ideas it's so hard to explore this particular space.

© Robert Wright/LDV Vision Summit

What we saw back in the day was that the hypothesis space of all these different ideas, and combinations of them, was overwhelming to explore. As a graduate student, you're looking at all of this; they all kind of make sense, kind of not, and you're not really sure how to combine them. The space was largely unexplored. If you take one particular idea, for example, you will see that it has many, many different parameters, depicted in red here. You have a lot of parameters and a lot of models, and it's very, very overwhelming. Again, for deep learning, there are so many parameters. How do you set those parameters?

The usual formula is that you take one grad student in a given lab. You take one particular model - usually the model will be derived from that particular lab, and its size will be limited by runtime. At the time everyone was running MATLAB. You tweak all the different parameters by hand, one by one, and hopefully you can crush a few benchmarks. You hope that you can get this kind of work published and you claim success. Don't forget, at the end of all of this, you get one Ph.D.

But if you tweak all of these different parameters by hand, one by one, not really knowing what you want to do, it's a little bit like what some people call graduate student descent: taking one grad student and exploring this space slowly, one step at a time. That's very aggravating and very, very boring.

We wanted to do something a little bit differently. We would still take one grad student - that would be me in this case. But what we wanted to do is test many, many different ideas and take big models, big models approaching the scale of natural systems. Hopefully we could crush a few benchmarks as well, maybe even get that published, and hopefully get one Ph.D. at the end of it.


If you want to have really good ideas, it's fairly simple. You just need to get many, many, many, many different ideas and just throw the bad ones away.

-Nico Pinto


The inspiration, I got it from this guy, Linus Pauling, double Nobel prize winner. He told me - well, he told everyone - that if you want to have really good ideas, it's fairly simple: you just need to get many, many different ideas and just throw the bad ones away. Very simple. In biology, people are trained to do that. It's called high-throughput screening, a very fancy name, and it's a very beautiful technique that kind of imitates natural selection. Let me show you how it works in biology.

What you do is you plate a diversity of organisms - you're looking for an organism that has some property you care about. You allow them to grow and interact with the environment. You apply some sort of challenge for the property you're looking for. You collect the surviving colonies, and ultimately you study and repeat until you find an organism that fits the bill. In biologically inspired computer vision, you do the same thing. You generate a bunch of random models from these many, many different ideas - some from you, some from the literature - apply some sort of learning to learn the synaptic weights and interact with the environment, test with a screening task (a particular vision task in this case), skim off the best models, study, repeat, and validate on other tasks. Hopefully you end up with the property you're looking for: a really good visual system.
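At its core, the screening procedure Pinto describes is a large random search over model configurations, with a screening task acting as the filter. The sketch below is a heavily simplified stand-in, not the original system: it samples random hyperparameters for a small scikit-learn network, screens each candidate on a validation split, and keeps the best few. The search ranges and the synthetic dataset are placeholder assumptions.

```python
# Simplified stand-in for high-throughput screening of model configurations:
# sample random candidates, screen them on a validation task, keep the best.
# Parameter ranges and the synthetic dataset are placeholder assumptions.
import random
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=40, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

def random_candidate():
    """Draw one random model configuration from the search space."""
    return {
        "hidden_layer_sizes": tuple(random.choice([32, 64, 128])
                                    for _ in range(random.randint(1, 3))),
        "alpha": 10 ** random.uniform(-5, -1),
        "learning_rate_init": 10 ** random.uniform(-4, -1),
    }

def screen(n_candidates=20, keep=3):
    """Train each candidate briefly and keep the top scorers."""
    scored = []
    for _ in range(n_candidates):
        params = random_candidate()
        model = MLPClassifier(max_iter=200, random_state=0, **params)
        model.fit(X_train, y_train)
        scored.append((model.score(X_val, y_val), params))
    return sorted(scored, key=lambda s: s[0], reverse=True)[:keep]

for score, params in screen():
    print(f"{score:.3f}  {params}")
```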

What's nice about this particular technique is that even though we call it high-throughput screening, it's basically just brute force. Right? It's a very nice name - we like that as scientists - but it's just pure brute force. In this particular case we needed a lot of compute power to run all of these models. But this was back in 2006, basically at the time when GPUs, these very cheap graphics processing units, started to become very, very powerful and actually programmable. We got lucky because we caught that trend very early on, basically ten years ago. The problem is that it took a while in the beginning because the GPUs were quite complicated to program. We had to build everything from scratch - the computers, the software stack, the programming. It was a lot of fun but quite hard at the beginning, since there was no library for it. We even went as far as building a cluster of PlayStation 3s back in 2007 to get the raw brute force power that we needed for these particular experiments.

We also got access to hundreds of GPUs, because at the time national supercomputing centers were building a lot of supercomputers with tons of GPUs, but not many people knew how to use them properly, so we had access to all of them because they wanted them to get used. With a brute force approach we could do that. We also taught courses back in 2008 and 2010 at Harvard and MIT on how to use these GPUs, to enable more computational science to be done cheaply with these graphical units.

Let me skip forward. We applied this technique and came up with a model that is all-encompassing, has tons of parameters, and encompasses all these different ideas, as I just mentioned. We applied our brute force technique, and at the end of the day we got very good results - we were very surprised. The results were so surprising that they even got featured in Science in 2010. Not only were we surprised, but even Yann LeCun himself was surprised. He told us that some of this work was influential in that we uncovered some very important non-linearities using this kitchen sink approach.

© Robert Wright/LDV Vision Summit

Since we had some very interesting results, we wanted to see if we could apply this to the real world. We compared our technology to a commercial system called face.com, which later got bought by Facebook, and we were able to crush their performance. We even got in touch with Google back in 2011, and they told us that we were influencing a little bit of their work, the early Google Brain work back in 2008.

We decided to start a company based on this; the company was called Perceptio. It was a very early, very quiet startup - you probably won't find much information about it. The goal of Perceptio was to come up with brain-inspired A.I. that you can trust. Trust was very important to us; we wanted to make sure we preserved the privacy of users.

Why a startup? Well, we wanted to make real progress, but we saw that academia and industry were kind of optimizing for progress and kind of not. On one side you have academia, which is a credit economy. In a credit economy, what you do is you plant flags and you guard territory. It's all about me, me, me first. You don't really know what's going on; you just have to plant flags - that's how you get a career. Industry is a profit economy. You have to make money, and a lot of the time what that means is selling user data. We wanted to create a new organization that would not operate like this. We had grandiose ideas, like many others: an intersection of incubator, industry lab and academic lab, focusing on progress only. It didn't work out in the end, but that's what we wanted to do.

The application we were focusing on was a small social camera, and our moat, our competitive advantage, was going mobile first. Everyone was going to the cloud; we wanted to bet against the cloud and go mobile first. Everyone was surprised at the time. It was 2012, and everyone was running deep learning on hundreds of thousands of CPU cores or big GPUs. People would say, "Why would you even try to do this? It's not even possible." If you look at it carefully, on paper it could not be done on the compute side - there is not much compute going on in a phone. But ultimately we did it, and a lot of the things we uncovered back in 2012 are now being rediscovered by the community. Some people claimed that we could not do it because we would not see enough data. Well, it turns out that if you're the most popular camera in the world, you get to see a lot more data than the cloud. With this camera, if you sit right next to the sensor, you can get dozens of frames per second, and only a fraction of that will ever go to the cloud.

People get it now. We could preserve privacy. Ultimately, we were able to predict the timing of the deep learning tsunami but not its velocity. We had to scale with this small company, and the only way for us to scale was to go through an acquisition. The problem with acquisitions is that most companies operate in the profit economy by selling user data, so it was really hard for us to find the right home for our technology and scale. We thought very hard, and we found a little fruit company back in Cupertino that, you know, thinks very differently. They think different about these things, and they really do care about users' privacy and not selling user data. That's where I am right now, that's where Perceptio is right now, and that's it. This is the end of the ten minutes, ten years of my life. Thank you very much.

Get your tickets now to the next LDV Vision Summit to see other phenomenal speakers with stories like Nico's.

Creating Compelling VR & AR Experiences at LIFE VR

Mia Tramz, Managing Editor of LIFE VR at Time, Inc.

Mia Tramz is the Managing Editor of LIFE VR at Time. At the 4th Annual LDV Vision Summit, Mia will be giving a keynote on "Pioneering VR & AR for Media Companies." Evan had a chance to catch up with her in May about the virtual reality and augmented reality projects she is producing at LIFE VR and where she thinks virtual reality will go in the next 5-10 years.

Discount tickets still available for the next LDV Vision Summit May 24 & 25 in NYC.

Please share with us how and why you evolved from a more traditional photo editing career into multimedia and now the Managing Editor of LIFE VR.
Immersive, non-traditional storytelling has always been of interest to me and even as a photo editor on TIME.com I was looking to push the boundaries of the way in which we tell stories visually. In 2014, about a year after I had been hired, I produced TIME’s first underwater 360 video, Deep Dive, with Fabien Cousteau. That project set me on a path towards VR – soon thereafter I started researching how to produce a VR experience for TIME. Around the same time, LIFE VR – Time Inc’s company wide VR initiative – was approved and they were looking for ideas to put into production. I worked with several of TIME’s editors and reporters to come up with a list of experiences we could produce that year – which ended up being about ten pages long. I think when they saw my early enthusiasm for the medium and how much leg work I had done, they felt I’d be the right person to launch the brand.

Deep Dive with Fabien Cousteau - Courtesy of TIME

What is your biggest challenge in creating VR and AR content today?
I’ve found the challenges for creating AR and VR to be quite different. 

With VR the biggest challenge is getting proper resources behind the projects I feel are most important to produce. These are often ambitious, moon shot projects with price tags to match. Raising capital to make sure those projects are properly produced is no small feat. I’ve been very lucky to have had early support from Time Inc and our brands in both creating and promoting VR and 360 video; outside of our company, we've been fortunate to have support from partners such as AMD and HTC on past projects such as Remembering Pearl Harbor. We’ve also been lucky to work with supportive production partners who have, in many ways, made it possible to achieve a very high quality of storytelling.

With AR, the biggest challenge is producing content that isn’t a gimmick – it needs to have inherent value for the consumer so that activating it isn’t a chore. It should feel delightful and compelling, and the user should feel that they got a return that was worth their time and effort, just like with any digital content. There are many ways to implement AR, both editorially and for advertising clients. With the AR camera now available in the LIFE VR app that we launched last week with the Capturing Everest issue of Sports Illustrated, we can launch 2D and 360 video content as well as 3D CGI animation and graphics off of both our print products and pretty much any other physical object. We can also make the pages of our magazines, including advertisements, shoppable. Parsing out the most impactful way to implement AR throughout our brands – one that serves both our editorial and sales teams – will be an exciting challenge in the months to come.

Remembering Pearl Harbor - Courtesy of LIFE VR

I am sure you see many different story opportunities to publish in VR. How do you choose the best stories to produce in VR today? Can you give a couple of examples?
At this early stage, it’s a lot to ask a consumer to download our app, find a headset and then dedicate time to watching the experiences we create. My guiding principle has been that any experience we produce or distribute has to be compelling enough that a consumer would go to all those lengths to be able to watch it – and that it delivers once they’ve invested the time and energy. Beyond that, an important part of my job is finding unique ways to bring the DNA of LIFE Magazine to the work we do. So, for example, LIFE covered the attack on Pearl Harbor extensively; when we were looking into historical events to recreate, it was a moment in history that LIFE – and TIME – could speak to authoritatively and something that we could weave LIFE imagery and reporting into. With Capturing Everest, the VR and AR project we just launched with Sports Illustrated, LIFE famously covered Sir Edmund Hillary and Tenzing Norgay’s first ascent of the summit with iconic photography and written reporting. Tackling the first bottom-to-top climb of Everest in VR allowed us to bring the spirit of that storytelling to a new audience, and a new generation.

Capturing Everest © LIFE VR & Sports Illustrated

Which business sector do you believe will be most disrupted by VR and why?
At the moment I see VR augmenting and supplementing business sectors versus disrupting. If you look at education, it’s an incredibly powerful learning tool that enhances the curriculum teachers already have in place; it’s been a great tool for the military and medical fields for decades; when applied to film making and journalism, creators now have the option to weigh it against other more traditional methods of covering a story such as photo or video. In each case, I see VR becoming another tool in the tool box, not necessarily a disrupter or replacement. 

Depending on how quickly facial recognition techniques evolve, I could see video conferences perhaps eventually being replaced by VR or AR conferences. The gaming industry may present the biggest question mark – but again it seems like VR will be a great option among many others, not necessarily a replacement for existing gaming consoles.

What excites you about speaking at our LDV Vision Summit?
Sharing of information is such a key part of the development of any industry or innovation, especially in its early stages – I’m a big believer in collaboration and the ‘all ships rise’ approach. What we are able to inspire in and learn from each other will shape the future of AR, VR and MR as much as what we are able to invent and discover on our own. Getting to share what I’ve learned and to learn from others is the most exciting part of participating in the summit.

What is not possible in VR today that you hope will be possible in 5-10 years?
The headsets themselves right now can be limiting. Implementation of inside-out tracking and accommodating AR and MR in addition to VR are all innovations that seem to be on the way, and I think they will support both user adoption and content creation.

The realistic rendering of human faces and registering of emotions in CG is also still a huge challenge and prohibitively expensive in most cases. Companies like 8i have developed methods for volumetrically capturing living people and their technology is improving day by day; incorporating some AI seems to be a necessary next step. When it comes to people who are no longer with us – or who never existed in the first place – rendering a face, emotions and responses that are believable is a big challenge. 

To experience LIFE VR, download the LIFE VR app for iOS and Android; visit the LIFE VR channel on Samsung VR; or visit time.com/lifevr. Certain experiences are also available on Steam, Viveport and in the Oculus Store.

reCaptcha Fights Spam And Sparked The Burger vs. Sandwich Debate

Ying Liu, Software Engineer at Google ©Robert Wright/LDV Vision Summit

Ying Liu is a Software Engineer at Google and was part of the team that created reCaptcha nine years ago. reCaptcha has changed a lot since then and there are some really exciting initiatives she is working on at Google that she told us about at the 3rd Annual LDV Vision Summit in 2016.

Tickets for the upcoming Vision Summit are on sale - get yours now!

reCaptcha is an anti-abuse tool. Our mission is to keep the internet organic, green, and free of spam and abuse. Earlier I was talking to someone in the group, trying to describe what reCaptcha is. At first he didn't get it, so I started verbally describing reCaptcha, and suddenly he got it and said, "Oh, it's that annoying thing on the internet." First of all, it's very sad to hear that associated with reCaptcha's brand. I hope by the end of the session I will have changed your mind on this.

©Robert Wright/LDV Vision Summit

This is reCaptcha seven to nine years ago. What we did at that time is we distorted synthetic text so that computers could not read it but humans still could. As computer vision improved and OCR got better, machines got really good at recognizing this kind of distortion. As a result, we had to make the distortion harder and harder, until it looked like this three years ago. I'm going to give you a second to try to transcribe what it says, but don't blame me if it hurts your eyes.

What we did is say: let's test this on humans and see how well humans can solve them. For known humans, only one-third of them could recognize it. Then we said, "Okay, how about machines?" Computer vision is getting really good, so we trained these hard captchas on the advanced machine learning systems inside Google. And guess what? They can solve them at 99.8% accuracy. The whole game flipped around: now reCaptcha is easy for bots, for machines, and hard for humans. That's when we knew we had to totally change the game in order to stay in the game.

This is what we launched a year and a half ago in late 2014. This is the new reCaptcha experience. We call it "No captcha reCaptcha".

©Robert Wright/LDV Vision Summit

Here's how it works. You are presented with a checkbox labeled "I'm not a robot". What you do as a user is click on the checkbox to prove to us - reCaptcha - that you're indeed a human. If we can verify that you're a human, you get a green check back. But the story is not that simple. In the back end, we have an advanced risk analysis system: based on your click and several interactions with us, we can pre-classify you on a spectrum between human and bot. If we think you're a human, a green check is returned automatically. If we think you're a bot, we tell you so and reject you right on the spot.

For every other case where we're not so sure or we think that you're kind of suspicious, we give you different captcha challenges. Here I'm just explaining two examples.

The one on the left is a 3 by 3 grid of natural images, where you as a user select all the images containing the common object. The one on the right is harder: you're given one picture and asked to localize exactly where that object is. As of today in 2016, this is still considered a difficult task for an advanced AI. I know earlier in today's session people were saying, "Oh, image recognition is a solved problem." Unfortunately, it's not solved for us - not until there is an off-the-shelf solution that can recognize any object in the world.
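As a rough, illustrative sketch of how a website plugs into this flow (not something covered in the talk), the checkbox widget hands the browser a token that the site's server then verifies against reCaptcha's public siteverify endpoint. The secret key, token source, and helper name below are placeholders.

```python
# Minimal sketch of server-side verification for the "No captcha reCaptcha"
# checkbox flow described above. Assumes the public reCAPTCHA v2 siteverify
# endpoint; SECRET_KEY is a placeholder for the site's private key, and the
# token is what the browser submits (normally the "g-recaptcha-response"
# form field) after the user clicks "I'm not a robot".
from typing import Optional

import requests

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"
SECRET_KEY = "your-secret-key-here"  # placeholder


def is_human(token: str, remote_ip: Optional[str] = None) -> bool:
    """Ask the reCaptcha risk-analysis backend whether this token came from a human."""
    payload = {"secret": SECRET_KEY, "response": token}
    if remote_ip:
        payload["remoteip"] = remote_ip
    result = requests.post(VERIFY_URL, data=payload, timeout=5).json()
    # "success" is true when the backend classified the interaction as human,
    # either from the click alone or after an image challenge was solved.
    return bool(result.get("success", False))
```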

We launched a year and a half ago. How did this new captcha experience do? I'm going to share some of the insights.

In the past one and a half years, we grew our footprint across the internet. Now we have over a million 7-day active clients. The "I'm not a robot" captcha widget that we're showing speaks 56 languages and covers 240 countries and regions. Every day we receive hundreds of millions of captcha solutions. Among all the correct solutions, roughly one third come from the "NoCaptcha" experience. NoCaptcha is defined as the direct pass, without solving a visual test.

Our mission is to keep the internet free of spam and abuse. To do that, we cannot drive humans away, so improving usability for humans has always been our top priority. In version 1, because of the pre-classification I was talking about earlier, we can serve much easier tasks to users we pre-classify as humans. That means easier text distortions, and we get an 89% pass rate. Pass rate here is defined as the total number of passed solutions over the total number of solutions. In version 2, that gets much better: the pass rate increases to 96%, and the remaining 4% of humans can always try again.

Solving captchas has also gotten much faster for human users. In version 1, because of the text distortion, you have to type through a keyboard, which is particularly cumbersome for mobile users. In version 2, that becomes two mouse clicks or screen touches. By doing that, we cut the solving time of a captcha almost in half. That's a few seconds saved for internet users on every captcha solved. Cumulatively, that is 50,000 hours we save the internet every day. 50,000 hours is almost six person-years. That is a lot of time you could spend watching cat and dog videos on YouTube rather than solving captchas.
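As a rough back-of-the-envelope check of those numbers (the daily volume and per-captcha saving below are assumptions chosen to be consistent with the talk, not figures from it):

```python
# Back-of-the-envelope check of the time-saved claim above. The daily volume
# and per-captcha saving are assumptions consistent with "hundreds of millions
# of captcha solutions" per day and "a few seconds" saved per solve.
captchas_per_day = 100_000_000   # assumed daily solutions that benefit from the new flow
seconds_saved_each = 1.8         # assumed seconds saved per captcha

hours_saved_per_day = captchas_per_day * seconds_saved_each / 3600
person_years_per_day = hours_saved_per_day / (24 * 365)

print(f"{hours_saved_per_day:,.0f} hours saved per day")   # ~50,000 hours
print(f"{person_years_per_day:.1f} person-years per day")  # ~5.7, i.e. almost 6
```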

Captchas are getting easier for human users. Here we're showing some stats from the bot analysis. For pre-classified bots, we give much harder captchas, and we see a significant attrition rate. The blue bar is how many times they click on "I'm not a robot" and the red bar is how many times they actually attempt to solve a captcha. As you can see, only 5% of the clicks lead to a solution attempt; the remaining 95% of bots basically abandon the experience and walk away. We checked the same thing for human users - is it because the captcha is hard that people walk away? Interestingly, human users actually attempt to solve the captcha more than 90% of the time.

This is the overall pass rate we observed globally for reCaptcha version 1. It is color coded: red is a low pass rate, meaning most of the solutions failed; green is a high pass rate, meaning most of them succeeded. As we move to the version 2 experience, the NoCaptcha experience, the map turns into a land of green. This is a very good thing for all internet users, because whenever you encounter a reCaptcha version 2 on the internet, you're most likely to solve it correctly. Unless you're a bot, in which case you're going to walk away.

To recap what I said just now, reCaptcha is getting easier and faster for human users and harder for bots. But this is not the end of our story. The other part I want to share with you is how reCaptcha is helping to make things better for humanity.

©Robert Wright/LDV Vision Summit

When we started reCaptcha seven to nine years ago, it was an anti-abuse tool, but most importantly it also helped digitize books. Remember that in version 1 we showed two words. One was a text distortion used to verify that you're human. The other word actually came from a book scan. If you answered the verification word correctly, we assumed that you also transcribed the book word correctly. In doing so, we have transcribed millions of books.

After book digitization, we tried reCaptcha on transcribing street numbers and street names. In doing so, we gathered the largest image training set online, and we have donated a significant chunk of it to the open research community. This is helping us build a better maps experience and more accurate maps for all internet users.

You can pretty much guess what I'm trying to say here. As we move into version 2, we're showing natural images for labeling. We're gathering the internet's collective intelligence to help teach machines and make AI smarter.

We're also celebrating holidays with internet users. Here are two example pictures from New Year's captchas.

The other thing that I didn't show here - as I was talking to some of you during the break - is that there are some funny things happening at reCaptcha. We started the biggest debate on the internet about what is a burger and what is a sandwich. People love to argue about those things. Or is a cupcake a cake? Those kinds of discussions. In doing so, we won a lot of internet love for reCaptcha.

To conclude my talk: reCaptcha is making a continuous effort to fight spam on the internet. We're making the internet a better experience for all human users, and we're also pushing the boundaries of research and making AI smarter.

“Killing Google With My Bare Hands” and Other Lessons Learned in Scaling with Jack Levin

©Robert Wright/LDV Vision Summit

Jack Levin was an early employee at Google, later started ImageShack, and is most recently at Nventify. In his keynote at the 2016 Annual LDV Vision Summit he spoke about his perspective from the early days of Google up to today. Tickets are on sale now for the upcoming Annual LDV Vision Summit.

I was privileged to be at Google from 1999 to 2005, essentially in the beginning of my career. I was twenty three or twenty four years old when I started.

I'm here to talk about scale: what it means, and how to actually scale large systems without burning yourself and your infrastructure out. After Google, I spent some time running a company called ImageShack, doing about two billion web hits a day, which was a pretty good scale story in itself.

©Robert Wright/LDV Vision Summit

So that picture is actually a rack that I built myself. It happens to be in the Computer History Museum in Mountain View. You can see that little label there, "JJ 17". I put that label there, and it has some of my blood DNA because, as you can see, this thing is a mess and I would often scratch my hands on that network card right there. That was my first day at Google: I essentially went into the data center, had to wire all of this and bring it online, and a week later we needed to launch netscape.com, which was the first really big client for Google.

Three hundred servers, but then the next week it was plus another two thousand, which was crazy. I had no idea what I was supposed to be doing; Larry Page said, "Hey, here's a bunch of cables, plug them in, you're good to go." That was the story for the first couple of years. A few years forward, I stopped going to the data center because there was a team of twenty-five people managing all of this.

So that's the second or maybe third generation of uber-racks. Tons of hardware, a lot of kilowatts being consumed, a lot of heat being generated, but a lot of queries being served.

So let's talk a little bit about disasters. I claim responsibility for killing Google with my bare hands a couple of times. We didn't know what we were doing - I clearly didn't know what I was doing - I would push the wrong button and Google would go down. I would jump on my motorcycle, or scooter at the time, ride to the data center, and literally run through it, unplugging all the power supplies and plugging them back in to wipe out my configs, and everything would come back. That happened a few times until I figured out, "Perhaps I should have a dial-up connection to the data center so I can dial in and undo my work."

But that comes with experience. We had a team of really smart people when it came to development, but we had no clue in operations. I had no idea what I was doing, and the people who were hired were mostly IT on the corporate side. One of the biggest problems for startups that need to scale up quickly is that the founders are not IT people or operations people. They hire people to run their data centers, but nobody knows what they're doing or where the pain points are, and so on.

For the longest time at Google, we didn't know what might kill Google. We had a bunch of monitoring, but we didn't know how to interpret it. Back in the day, around the year 2000, the way you would kill Google was to send it a query like "theological silhouette". The words have no meaning when combined, but Google would search all the way to the bottom of the index. If you sent five queries per second from your laptop, you could actually kill the whole Google search engine.

That was an interesting thing, and we learned that monitoring of queries and spam detection is really important.
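As a hedged illustration of that lesson (not Google's actual mechanism), a per-client token-bucket limiter is one simple way to keep a handful of expensive queries per second from taking down a search backend; the rate, burst, and client key below are illustrative choices.

```python
# A minimal per-client token-bucket rate limiter, sketched to illustrate the
# kind of query throttling the talk alludes to. This is not Google's actual
# mechanism; RATE, BURST, and the client_id scheme are illustrative.
import time
from collections import defaultdict

RATE = 5    # sustained queries allowed per second, per client
BURST = 10  # short-term burst allowance

_buckets = defaultdict(lambda: {"tokens": float(BURST), "last": time.monotonic()})


def allow_query(client_id: str) -> bool:
    """Return True if this client may run another query right now."""
    bucket = _buckets[client_id]
    now = time.monotonic()
    # Refill tokens in proportion to elapsed time, capped at the burst size.
    bucket["tokens"] = min(BURST, bucket["tokens"] + (now - bucket["last"]) * RATE)
    bucket["last"] = now
    if bucket["tokens"] >= 1:
        bucket["tokens"] -= 1
        return True
    return False  # over the limit: reject, queue, or flag as suspicious
```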

©Robert Wright/LDV Vision Summit

That's actually one of my favorite slides. It's not specifically about Twitter, but you know that back in the day Twitter would go down all the time. What's going on, why is it always down? The interesting thing about Twitter is that it's not the language - Ruby isn't bad, Python isn't bad. What happened is that Twitter, should I say, "flew away" from the company that tried to build it. They just got so popular so quickly that it was very difficult to scale. Sometimes it takes luck, persistence, people working more than nine to five, twenty-four hours for several days, just to get things up and running in the right kind of way.

Eighty percent of the time, you just don't get it right. This is mostly about the operations teams. A lot of these startups, when they hire their ops teams, find that those teams don't claim responsibility, and more often than not it's because they're a disenfranchised group. Most of the people who call the shots are the founders, and the operations folks are just trying to run things but aren't really at the forefront of the company's business.


Sometimes it takes luck, persistence, people working more than nine to five, twenty-four hours for several days, just to get things up and running in the right kind of way.


Postmortems are very important. When things break, you do need to talk about them and have your peers discuss them, but more often than not you also need to think about the future: what can possibly go wrong? That's a premortem. A premortem can help you envision the kinds of disasters you generally don't think about. If you don't think about them, things are likely not going to work out for you.

In the early years of Google, I had no backout plan. It's not that I like to live dangerously, I just really didn't know what I was doing. A backout plan is very important. Right now at Nventify, the second company that I co-founded, it's very important for me to ask my engineers, "So you're going to make all those changes - do we know how to roll back?" Usually the answer I get is, "Well, I know what I'm doing." More often than not, that's not the case, and you need a backout plan.
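As a minimal sketch of what a backout plan can look like in practice (an assumption on my part, not a description of Nventify's setup): keep every release in its own directory and serve from a "current" symlink, so rolling back is just re-pointing the link at the previous known-good release. The paths and function names below are illustrative.

```python
# Minimal sketch of a rollback-friendly deploy: each release lives in its own
# directory and a "current" symlink points at the live one, so the backout
# plan is simply re-pointing the symlink. Paths and names are illustrative.
import os

RELEASES_DIR = "/srv/app/releases"
CURRENT_LINK = "/srv/app/current"


def activate(release: str) -> None:
    """Atomically point the 'current' symlink at the given release directory."""
    target = os.path.join(RELEASES_DIR, release)
    tmp_link = CURRENT_LINK + ".tmp"
    os.symlink(target, tmp_link)
    os.replace(tmp_link, CURRENT_LINK)  # atomic swap on POSIX filesystems


def rollback(previous_release: str) -> None:
    """The backout plan: re-point 'current' at the last known-good release."""
    activate(previous_release)
```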

How to scale. That quad-copter seen at sea is a great picture of scale that actually works. So when do you scale, and how? Interestingly, most of the building blocks you need are already available on GitHub. Just go to GitHub, get your building blocks downloaded and tried by your engineers - there's no point rebuilding the whole thing, especially if it's not your core business. The company should really be focusing on its core business, not reinventing the building blocks.

Surprisingly, you're a better engineer if you know how to use Google. You can find that a lot of things have already been solved. Using Google well is a skill that every engineer should have.

That's a very important slide here. NIH stands for Not Invented Here. A lot of bigger companies, when they scale up, usually have people who say, "Hey, we never want to buy anything. We don't want to use anything that's open source. We want to create everything in-house so we know exactly what is going on with our libraries." That's often a fallacy: it is expensive and slows down your progress and your ability to deliver product to end users.

©Robert Wright/LDV Vision Summit

I want to spend a few minutes talking about the future. Clearly, it's getting cheaper and cheaper to store your files. All of the great companies that you see on this screen will likely be competing against each other in the future; likely within five years the cost of storage will go down to zero and you will end up paying for something else.

The way I see it, especially with Google's efforts to deliver Fiber and satellite connections everywhere - and Facebook's as well - it's very likely that everybody is going to have free internet and essentially be plugged in. That's an interesting concept. We talk about storage nodes, and we talked about them at Google as well. Likely what we're going to see in five to fifteen years are storage nodes that are self-aware, driven by conditions and in some cases by AI. This way, if you get on a plane and have your data with you, it won't be on your phone, but it might follow you from the terminal onto the plane, load up there, and all of your movies and files will be right there.

I call it AI-powered peer-to-peer storage. I know, it's kind of cool. There's more and more interesting technology being developed when it comes to consuming data. Specifically, I've seen some interesting car windshield glass, and there's talk at Google about contact lenses that can create a VR feeling right in your eyes without wearing anything but contact lenses. This is Google Glass: currently we use text to query for things and find things, but it's very likely that, if it isn't Google Glass, it'll be something similar, where visual information is used for searches.

This is maybe twenty to twenty-five years from now, when advanced technology will give people the ability to record all of your experience from your visual cortex and the feeling of whatever you're touching, and eventually share this data between humans. It's not telepathy, but more like close-range mind-to-mind communication that will be possible with technology.

Augmented reality and VR will likely merge, and we're likely to see an absence of keyboards and just using our minds and hands to manipulate and interact with data.

Discount tickets available for the upcoming LDV Vision Summit until May 15.

VR and Mixed Reality Platforms Are a Paradigm Shift in Storytelling

Heather Raikes, Creative Director of 8ninths ©Robert Wright/LDV Vision Summit

Join us at the next annual LDV Vision Summit in NYC. Early bird tickets are on sale until April 30.  

Heather Raikes is the Creative Director of 8ninths, and she spoke at our 2016 LDV Vision Summit about design patterns for evolving storytelling through virtual and mixed reality technologies.

Storytelling is in our DNA; it's part of what makes us human. How we tell our stories shapes our culture, deeply affects how we understand ourselves and each other, and shapes how we engage with the world around us. I'd like to start with a macro view of some archetypal patterns that underscore the fundamentals of storytelling and contextualize its evolution through emerging technologies.

The core construct of traditional storytelling is the linear narrative. The ancient art of the storyteller could be used as a starting point. Sitting around a campfire, an audience gathers usually in a circle and listens to the stories and songs of the storyteller. The temporal format is linear and continuous, and the storyteller is a clear and singular focal point for the experience. Theater offers an audience a more immersive experience of a story. Stagecraft evokes the narrative world. The audience identifies with actors portraying the story characters. The temporal experience is still linear and continuous, but the focal point is expanded from a single storyteller to the world of the stage.

The focal point is further expanded in film. The story is told from a montage of different perspectives. Temporal engagement is still linear, but the focal point shifts continuously and dramatically within the world of the screen. In television, the story becomes discontinuous and episodic. The focal point mimics film in the form of the montage within the world of the screen, but temporally the audience engages and disengages at will.

A more significant shift comes in the transition from analog to digital storytelling. Native digital storytelling is participatory and interactive, disrupting many of the tenets of classical storytelling. In gaming, the audience essentially becomes the protagonist, and their actions unfold the action of the story, which is experienced from a first person perspective.

When you follow someone on social media, you are a live witness to their story, which has no clear ending and is told from an infinite number of discrete focal points derived from their journey through life. You are presumably contributing your story to this forum as well. There becomes a merging between the story you are witnessing, the story you are telling, and the story you are living.


In virtual reality, the story you are experiencing or witnessing becomes perceptually indistinguishable from your reality. You are completely immersed in a virtual world, and on some level your neurosensory processing system believes that it is reality.

-Heather Raikes, Creative Director of 8ninths


The next paradigm shift in storytelling is currently arriving with the onset of virtual and mixed reality platforms. In virtual reality, the story you are experiencing or witnessing becomes perceptually indistinguishable from your reality. You are completely immersed in a virtual world, and on some level your neurosensory processing system believes that it is reality. Comparably but differently, in mixed reality the story integrates seamlessly with your physical environment and your immediate perceptual framework, again, merging your reality with the world of the story.

©Robert Wright/LDV Vision Summit

I'm currently the creative director at 8ninths, a virtual and mixed reality development studio based in Seattle. We're working in this space and applying VR and MR technologies not just to entertainment stories but also to business. We've found that design patterns are an important compass for our team, for our partners and clients, and for the community at large in figuring out what to make creatively of this brave new space. I'm going to give you a tip-of-the-iceberg roadmap of our starting points in thinking about developing for these emerging media.

Virtual reality is currently in its launch phase and is presenting a spectrum of platforms ranging from high end desktop room scale VR with physical tracking systems, to mobile VR platforms, to affordable contraptions that you can snap any mobile phone into. Some of the design patterns that we're currently exploring and developing for VR include visual grammars for 360 storytelling, temporal structures and story rhythms that are native to VR, world to world transition techniques, spherical user interface design, spatialized audio-video composition techniques, virtual embodiment and iconographic representation of physical presence in virtual spaces, and syntax for virtual collaboration.

Mixed reality is currently pre-launch. Developer editions of Microsoft HoloLens and Meta are just starting to ship, and Magic Leap is still pre-developer release. 8ninths was one of seven companies selected worldwide to be part of an early access developer program for HoloLens, and we've been working with HoloLens since last fall. We did a major project with Citibank in their Innovation Lab exploring expanding information-based workflow into mixed reality. As part of that process, we created a document called the HoloLens Design Patterns that breaks down and looks at core building blocks of early holographic computing experiences.

In the interest of time - this is a five-minute talk - I'll close by saying this story is really just beginning to unfold. That is a sincere statement. It's a really exciting time in history. We will continue to publish virtual and mixed reality design patterns at this URL. We invite you to be part of the conversation. Thank you.

©Robert Wright/LDV Vision Summit