Image Recognition in Apps: Why and How to Use

Image recognition technology is increasingly present in our everyday life. Companies and enterprises use it to solve a wide range of problems, from security to customer satisfaction research. Investments in products with image recognition features are projected to reach $39 billion by 2021. Here are just a few examples of how image recognition is used in different fields:

Healthcare. Illness changes everyone’s looks, and not for the better, but some diseases alter appearance more than others. By comparing a patient’s facial distortions with those preserved in a database, doctors can make a more precise diagnosis and even estimate the intensity of the pain the patient suffers from. Apps like AiCure record whether the patient has taken their medication. And OrCam MyEye is a real rescue for people with poor eyesight: it tells the user about everything it ‘sees’.

Computer vision detecting the consumption of medication. Image from www.ncbi.nlm.nih.gov

Tourism. Saw an amazing photo on Instagram and want to dine at the same cafe? What about uploading the photo to an app that takes you to that very place? Based on Instagram photos, the Jetpac City Guides service even creates top lists for tourists with descriptions and reviews.

Transport. Image recognition is the function without which a driverless car won’t cover even the smallest distance without an accident. But if your fleet is operated by human drivers, you can still make use of this machine learning technology, for example, to prevent unauthorized use of vehicles or to detect driver inattentiveness.

E-commerce. This is where image recognition is most widely applied. Online shops such as eBay and Boohoo help users find ‘that very’ item even when the client can’t remember its name. And apps such as Lookwish and NICE even let you virtually ‘try on’ clothes.

Do you believe this technology is only for large and wealthy players? Indeed, implementing machine learning (which is the core of image recognition) from scratch can be troublesome. Luckily, there are public libraries that let you use ready-made models when developing your products. Firebase ML Kit is one of them. Below, we will show how it can be used in app development.

What Firebase ML Kit Is and How It Works

ML Kit is an SDK that lets you use Google’s ready-made machine learning solutions for iOS and Android in a simple way. Even if you have no experience with machine learning, you can implement the necessary functions in a few lines of code. And if you are an experienced developer, you can plug in your own custom TensorFlow Lite models.

ML Kit can work both online (in the cloud, which gives you access to far more powerful models, but with a limited number of requests: only the first thousand per month are free of charge) and offline, on the device. Text recognition and image labeling are available both on-device and in the cloud, barcode scanning and face detection run only on the device, and landmark recognition (well-known buildings, rivers, streets, etc.) is available only in the cloud.
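
Before any of this can run, ML Kit has to be added to the project. Below is a minimal sketch of the app-level Gradle file (Kotlin DSL); the `firebase-ml-vision` artifact is the standard ML Kit for Firebase dependency, but the version shown is only illustrative, so check the Firebase release notes for the current one:

>
// app/build.gradle.kts (assumes the google-services plugin is already applied)
dependencies {
    // ML Kit for Firebase vision APIs: text, face, barcode, label and landmark detection.
    // The version is illustrative; use the latest one from the Firebase release notes.
    implementation("com.google.firebase:firebase-ml-vision:24.0.3")
}
>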

How to Integrate Image Recognition into the App

As an example, let’s start with on-device (offline) text recognition:

>
private fun runTextSearchOnDevice(bitmap: Bitmap) {
    // Wrap the bitmap into ML Kit's image container.
    val image = FirebaseVisionImage.fromBitmap(bitmap)
    // Get the on-device (offline) text recognizer.
    val textDetector = FirebaseVision.getInstance().onDeviceTextRecognizer
    textDetector.processImage(image).addOnSuccessListener {
        // 'it' is the FirebaseVisionText result with the recognized blocks.
        processTextSearchFromDevice(it)
    }.addOnFailureListener {
        Toast.makeText(this, it.toString(), Toast.LENGTH_SHORT).show()
    }
}
>
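
On success, the listener receives a `FirebaseVisionText` object. Here is a minimal sketch of what the `processTextSearchFromDevice` handler might look like (the `Log` output is ours, purely for illustration):

>
private fun processTextSearchFromDevice(result: FirebaseVisionText) {
    // Each recognized block carries its text and a bounding box (Rect).
    for (block in result.textBlocks) {
        Log.d("MLKit", "Found '${block.text}' at ${block.boundingBox}")
    }
}
>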

In the `processTextSearchFromDevice` method, then, we process the data returned by Firebase: each detector has its own result type, described in the documentation, and in our case we get a list of text blocks, each with the coordinates of the text in a `Rect` object and, of course, the recognized text itself. For cloud (online) use, the code is almost the same:

>
// Hint which languages to expect in the image (e.g. from a string-array resource).
val options = FirebaseVisionCloudTextRecognizerOptions.Builder()
    .setLanguageHints(resources.getStringArray(R.array.textRecognitionLanguages).toList())
    .build()
// Get the cloud-based text recognizer configured with these options.
val textDetector = FirebaseVision.getInstance().getCloudTextRecognizer(options)
>

Now we can add hints about the languages we expect Firebase to recognize. As a result, we get the same kind of result object. If you plan to recognize documents, you may find `getCloudDocumentTextRecognizer` more convenient.
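
For instance, a minimal sketch of the document recognizer (its result type, `FirebaseVisionDocumentText`, exposes the recognized text structured into blocks and paragraphs; the logging is just for illustration):

>
// Cloud-only document text recognizer, suited for dense, printed documents.
val documentDetector = FirebaseVision.getInstance().cloudDocumentTextRecognizer
documentDetector.processImage(FirebaseVisionImage.fromBitmap(bitmap))
    .addOnSuccessListener { documentText ->
        // The full recognized text; blocks and paragraphs are also available.
        Log.d("MLKit", documentText.text)
    }
>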

For face recognition, the approach is almost the same:

>
val image = FirebaseVisionImage.fromBitmap(bitmap)
// getFaceDetectorOptions() is our helper that builds the options shown below.
val faceDetector = FirebaseVision.getInstance().getVisionFaceDetector(getFaceDetectorOptions())
faceDetector.detectInImage(image).addOnSuccessListener { ... }
>

In the options, we can specify whether we need to detect facial ‘landmarks’ (eyes, nose, ears, etc.), the detection mode (speed or accuracy may matter more to us), and classification by categories, for example, whether the eyes are open or the face is smiling. Here is an example of the settings:

>
FirebaseVisionFaceDetectorOptions.Builder()
    .setModeType(FirebaseVisionFaceDetectorOptions.ACCURATE_MODE)                // favour accuracy over speed
    .setLandmarkType(FirebaseVisionFaceDetectorOptions.NO_LANDMARKS)             // skip eyes, nose, ears, etc.
    .setClassificationType(FirebaseVisionFaceDetectorOptions.NO_CLASSIFICATIONS) // skip smile / open-eyes classification
    .build()
>

As a result, Firebase returns an array of face objects with their coordinates and, if landmark detection was enabled, the coordinates of the detected ‘landmarks’ as well.
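
As a rough sketch of how the result can be handled (note that the landmark lookup returns something only if landmark detection was enabled in the options, unlike the example above; the logging is purely illustrative):

>
faceDetector.detectInImage(image).addOnSuccessListener { faces ->
    for (face in faces) {
        // Bounding box of the detected face.
        Log.d("MLKit", "Face at ${face.boundingBox}")
        // Present only when landmark detection is enabled.
        face.getLandmark(FirebaseVisionFaceLandmark.LEFT_EYE)?.let { eye ->
            Log.d("MLKit", "Left eye at ${eye.position}")
        }
    }
}
>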

Finally, let’s discuss image labeling, i.e. recognizing what is shown in a photo. In general, everything is almost the same as in the previous cases. Here, for example, is how it looks offline:

>
// On-device (offline) image labeling.
val labelDetector = FirebaseVision.getInstance().visionLabelDetector
labelDetector.detectInImage(image).addOnSuccessListener { ... }
>

And this is an online variant:

>
val options = FirebaseVisionCloudDetectorOptions.Builder()
    .setModelType(FirebaseVisionCloudDetectorOptions.LATEST_MODEL) // or STABLE_MODEL
    .setMaxResults(15)
    .build()
val labelDetector = FirebaseVision.getInstance().getVisionCloudLabelDetector(options)
labelDetector.detectInImage(image).addOnSuccessListener { ... }
>

In the options, you can set the maximum number of results returned and the model to be applied (the stable one or the latest available). As a result, you receive an array of objects, each with a label (whatever Firebase ‘sees’ in the photo) and a confidence score between 0.0 and 1.0.
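
A minimal sketch of handling the on-device result (the cloud detector returns an analogous list; the logging is just for illustration):

>
labelDetector.detectInImage(image).addOnSuccessListener { labels ->
    for (label in labels) {
        // For example: "Dog" with confidence 0.93.
        Log.d("MLKit", "${label.label}: ${label.confidence}")
    }
}
>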

As you can see, there is nothing complicated here, and in most cases the free (on-device) ML Kit is enough.

Let’s Sum Up

Machine learning, and image recognition in particular, is not necessarily something out of the ordinary. For example, in your app it can be used at the stage when users add their photos during registration. All you need is to check whether the picture is appropriate, or at least that it is not a photo of the user’s pet. And the client wants to start using your product as soon as possible, without waiting for the photo to be approved by a moderator.
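
As a rough sketch of that idea (assuming we simply require at least one detected face before accepting an avatar; `onPhotoAccepted` and `onPhotoRejected` are hypothetical callbacks of your registration flow):

>
private fun validateAvatar(bitmap: Bitmap) {
    val image = FirebaseVisionImage.fromBitmap(bitmap)
    // The default on-device face detector is enough for a quick sanity check.
    val faceDetector = FirebaseVision.getInstance().visionFaceDetector
    faceDetector.detectInImage(image).addOnSuccessListener { faces ->
        // No face found: most likely not a suitable profile photo.
        if (faces.isEmpty()) onPhotoRejected() else onPhotoAccepted()
    }
}
>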

Implementing image recognition is not always costly. You may use existing solutions, fully or partially, Firebase ML Kit being one of them. We hope the approach described above will help you do it without extra effort. And if you want to be completely sure of the result, contact our team. We will readily help you with the development of web services and mobile apps.