Make Computers See with SimpleCV — The Open Source Framework for Vision

At first this article may seem a little outside the scope of the usual human augmentation or singularity articles you read, as it is more a present-day application than a future-tense one. People envision the singularity with different outcomes: some think humans will merge with machines, others that humanity will destroy itself with a super virus. Until it happens, all of these are just predictions based on intelligent hypotheses. To be honest, I have no idea what the future entails, but I do know that I want to help build it, which is where SimpleCV comes in.

As we’ve seen over the last 50 years, technology can have a major impact on society. With the advent of computers, it has become possible to model real-world problems in a virtual domain, reducing the time, cost, and complexity involved. What matters most for any modeling application, or any scientific process for that matter, is the data: how it is acquired and how good it is. This is where vision systems will play a key role, by providing the data acquisition part of the singularity.

Vision systems not only capture data, they capture it in a way that is smarter than many other types of sensors in existence today: they can provide context. Compare an alarm system with a security camera. The alarm system only notifies you that a break-in has occurred, while a security camera can give you information about the intruder, how they broke in, how long it took them, and so on. Vision systems can autonomously convert real-world data into virtual data that a computer can then process and model to help predict solutions to real-world problems. There are many applications where an extra “set of eyes” would be extremely useful, or where something currently done by a human could easily be done by a machine.

As it stands today, there are already quite a few vision system manufacturers worldwide, although many of their systems are plagued by problems such as high cost, technical difficulty, licensing restrictions, and poor scalability. This is strange considering that you may be reading this article on a device that already has a low-cost vision sensor built in and is already able to run many types of vision applications. One existing real-world application is scanning a paper check with your mobile phone. Another is using Google Goggles in an unfamiliar city to take a picture of a building and get information about it. Probably the most mainstream use nowadays is the QR code that many products have printed on their labels. Even more recently, Google helped get a law passed so that autonomous vehicles can drive in the state of Nevada. As you can see, these applications are sneaking into our everyday lives without much notice.

As vision systems gain momentum, so will the number of developers and consumers of these devices. One problem with current vision systems is that they are proprietary. This raises the barrier to entry, which in turn causes scaling problems. Proprietary vision software may work well for one particular application but not for general vision applications. A topic that comes up a lot among the singularity crowd is Artificial Intelligence vs. Artificial General Intelligence. The same comparison applies to the current state of vision systems vs. what we are trying to build with SimpleCV: a more general approach to computer vision.

With an open source model, iterative development cycles typically run much faster than under a proprietary model. A good example is Microsoft Windows, where a release comes out about every 5 to 10 years, whereas Ubuntu Linux has a 6-month release cycle. The barrier to entry is also lower: the limitation is no longer the cost of the technology but the time and education needed to learn it. This model has been proven numerous times since the advent of the Internet. Many large companies, such as Facebook and Google, use open source software in their businesses because it gives them rapid deployment and quick adaptability.

So after all that you are probably asking, “What is SimpleCV?” It is an open source computer vision framework that lowers the barrier to entry for people across the globe to learn, develop, and use computer vision. There are already a few open source vision libraries, but the downside is that you have to be a domain expert in vision systems and know a cryptic programming language like C. Where SimpleCV differs is that it is “simple”. It has been designed with a web browser interface, which is familiar to Internet users everywhere. It talks to your webcam (which most computers and smartphones have built in) automatically. It works cross-platform (Windows, Mac, Linux, etc.). It uses the Python programming language rather than C to greatly reduce the learning curve. It sacrifices some complexity for simplicity, which is needed for mass adoption of any new technology.

The applications for vision systems are almost limitless, and we’ve included many examples with the framework. A good one is the face detection demo. With SimpleCV it is only 15 lines of code and runs in a web browser. Other open source libraries may take hundreds of lines of code to accomplish the same thing and may require their own interface (though SimpleCV has that advanced functionality as well, for those who want it).

Another example uses a quarter for scale: you put a quarter in front of the camera alongside something else and take the picture, and the quarter is then used as a reference to measure the other objects in the scene. Another example is motion detection for security. There are plenty more included as well.
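The quarter-for-scale idea comes down to a single ratio. The sketch below is not taken from the SimpleCV example; it is plain Python with hypothetical function names, and it assumes you have already measured, in pixels, the quarter's diameter and the length of the other object. It also assumes both lie at roughly the same distance from the camera, otherwise the ratio does not hold. A US quarter is 24.26 mm across.

```python
# Illustrative sketch only: the function names and pixel values are
# hypothetical and are not part of SimpleCV.

QUARTER_DIAMETER_MM = 24.26  # diameter of a US quarter


def mm_per_pixel(quarter_diameter_px):
    """Return the scale (millimetres per pixel) implied by the quarter."""
    if quarter_diameter_px <= 0:
        raise ValueError("quarter diameter in pixels must be positive")
    return QUARTER_DIAMETER_MM / quarter_diameter_px


def measure_mm(object_length_px, quarter_diameter_px):
    """Convert an object's length in pixels to millimetres."""
    return object_length_px * mm_per_pixel(quarter_diameter_px)


if __name__ == "__main__":
    # Assume the quarter spans 100 px and the object 350 px in the same photo.
    print(round(measure_mm(350, 100), 2))
```

With those assumed pixel measurements the object comes out at roughly 85 mm long.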

Another thing I haven’t seen any other vision library do is talk “over the cloud”. We decoupled image acquisition from image processing, so we can use low-cost ($30) webcams that are basically dumb terminals sitting on a network and pushing data up to the Internet. Since this technology can work in the cloud, you can start building “zero installation” systems. Some of our demos showcase this, like our simple animation station: you open a web page and start taking and processing pictures. At the time of this article, we are working on a web-based security camera that also works in your browser and will e-mail you a picture if motion is detected. All you have to do is visit a web page. Pretty simple if you ask me.
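For readers wondering what "if motion is detected" can mean in practice, one common technique is frame differencing: compare two consecutive frames and flag motion when enough pixels have changed. The article does not say how SimpleCV does it, so the sketch below is only an assumption-laden illustration in plain Python. Frames are lists of grayscale rows, and the function name and both thresholds are hypothetical.

```python
# Illustrative sketch only: frames are plain lists of grayscale rows (0-255).
# The function name and thresholds are hypothetical, not SimpleCV's API.


def motion_detected(prev_frame, curr_frame,
                    pixel_threshold=25, changed_fraction=0.01):
    """Return True when enough pixels differ between two same-sized frames."""
    same_shape = len(prev_frame) == len(curr_frame) and all(
        len(a) == len(b) for a, b in zip(prev_frame, curr_frame)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")

    total = sum(len(row) for row in curr_frame)
    if total == 0:
        return False

    # Count pixels whose brightness changed by more than the threshold.
    changed = sum(
        1
        for prev_row, curr_row in zip(prev_frame, curr_frame)
        for p, c in zip(prev_row, curr_row)
        if abs(p - c) > pixel_threshold
    )
    return changed >= changed_fraction * total


if __name__ == "__main__":
    background = [[10] * 8 for _ in range(8)]
    moved = [row[:] for row in background]
    moved[3][4] = 200  # one bright pixel appears in the second frame
    print(motion_detected(background, background))
    print(motion_detected(background, moved))
```

The per-pixel threshold ignores small sensor noise, and the fraction threshold keeps a single flickering pixel from triggering an alert on a large frame. Both would need tuning for a real camera.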

Personally, I believe that in the next 15 years we will see an explosion in vision sensors. These sensors will be used for everything from controlling user interfaces and monitoring traffic flow to national security, autonomous vehicles, manufacturing, and quality control. The possibilities are limitless, as there are many horizontal and vertical market segments yet untapped by vision. Vision systems have the potential to be much bigger than the mobile market, mainly because mobile devices typically require human interaction to be useful, whereas a vision system can perform its task completely autonomously. For those who have the “vision for vision”, we are always seeking people interested in contributing their ideas, thoughts, and talents.

SimpleCV is freely available to demo, download, hack, or modify at: http://www.simplecv.org

Anthony Oliver
Chief Technology Officer
Ingenuitas, Inc.
Ann Arbor, MI 48108
906-289-8169
anthony@ingenuitas.com
http://www.ingenuitas.com

5 Comments

  1. This could be used to link up several cameras and act like a security watch, sending alerts whenever strange activity is recorded. I’ve seen ways to set up a camera as a motion detector, but I think making it recognize who is entering your apartment and send alerts is where this can really help. Without these features, just having a camera record footage of a robbery is pretty pointless.

  2. @EMS: We do in fact wrap OpenCV. We aren’t trying to replace OpenCV by any means. We are well aware of what the Willow Garage guys are doing; in fact our founder was out there a couple of weeks ago, and I was chatting with them at NY Maker Faire this past weekend. They are doing some awesome things, and we are only looking to make some of those things easier to use.

    OpenCV is more of a library, whereas SimpleCV is more of a framework. OpenCV is C/C++ based, and C/C++ is probably one of the most difficult languages for a beginner to learn, whereas Python (which SimpleCV is based on) is much more intuitive. Python also plays nicer with many other frameworks, since many people have written Python wrappers for them.

    Another example is the setup. We have 1-click installs for all platforms; OpenCV does not. There isn’t a shell that can hold your hand, so you have to set up Visual Studio, Eclipse, or something similar. Much time is spent just figuring out how to get your system configured. With SimpleCV, you click, fire up the terminal, and start walking through the built-in examples we have created.

    Here is a perfect example of why we are creating it.
    Here is how you just acquire and show an image in OpenCV:

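    #include <opencv/cv.h>
    #include <opencv/highgui.h>

    int main( void )
    {
        /* Create the display window and a blank single-channel image. */
        cvNamedWindow( "My Window", 1 );
        IplImage *img = cvCreateImage( cvSize( 640, 480 ), IPL_DEPTH_8U, 1 );
        cvZero( img ); /* cvCreateImage leaves the pixel data uninitialized */
        /* Set up the font, then draw the text onto the image. */
        CvFont font;
        double hScale = 1.0;
        double vScale = 1.0;
        int lineWidth = 1;
        cvInitFont( &font, CV_FONT_HERSHEY_SIMPLEX | CV_FONT_ITALIC,
                    hScale, vScale, 0, lineWidth );
        cvPutText( img, "Hello World!", cvPoint( 200, 400 ), &font,
                   cvScalar( 255, 255, 0 ) );
        /* Show the image until a key is pressed. */
        cvShowImage( "My Window", img );
        cvWaitKey( 0 );
        return 0;
    }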

    —————————
    Here’s how to do it in SimpleCV:

    from SimpleCV import Image

    img = Image("/path/to/img.png")
    img.drawText("Hello World!", 200, 400)
    img.show()

    —————————-

    There is no reason to bash SimpleCV. We are just trying to make computer vision easier to use for the average person. As I mentioned in my article, the vision space is new and growing, and there is plenty of room for both to exist. You can think of it as the jQuery or the Arduino of computer vision. It is open source, so if there is anything you want added or fixed, or that you feel we are doing wrong, feel free to fork it on GitHub (http://github.com/ingenuitas/simplecv) and add your contributions, and we will merge them into the main framework. That is much more helpful to the vision community as a whole than just complaining. If you are a vision grad student, then there is probably a lot you can bring to the table, and I welcome any contributions you want to make to the framework.

  3. EMS: did you even bother clicking on the link? It is a wrapper for OpenCV, among other things. Of course OpenCV offers similar features.

  4. Congrats on your work on SimpleCV. I am a believer that any investment in improving technology, especially computer vision and machine learning, is very welcome. Nothing bad can come from competition between SimpleCV and OpenCV. Keep up the good work!

  5. I think that the Willow Garage maintained OpenCV project offers many of the capabilities that you assert are unique to SimpleCV. As a graduate student in computer vision and sensory perception, I definitely prefer to resort to OpenCV over SimpleCV. I think efforts would be better used if directed at OpenCV. For example, OpenCV could benefit from more exposure to its machine learning ability and exposing feature computations (like histogram of gradients and other descriptors). Just that one contribution to OpenCV would be more meaningful than the sum total of SimpleCV thus far.
