Tuesday, 7 January 2020

Object Localization (Classification with Localization)


Goal: predict the class of the main subject of the image and its location.

Input:

  • image
  • list of labels (categories/classes)

Output:

  • predicted class of the main subject of the image
  • predicted position of that object in the image (its bounding box - the minimal rectangle that completely contains it)
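
To make the bounding box definition concrete, here is a minimal sketch that computes the tightest axis-aligned rectangle around an object given as a binary pixel mask. The function name `bounding_box`, the mask input and the pixel-edge coordinate convention are assumptions made for illustration only; they are not part of the original notes.

```python
import numpy as np

def bounding_box(mask):
    """Return (cx, cy, width, height) of the tightest box around True pixels.

    Assumption: coordinates are measured along pixel edges, so a box that
    covers columns 2..6 spans x = 2 to x = 7 and has width 5.
    """
    rows = np.flatnonzero(mask.any(axis=1))  # rows containing object pixels
    cols = np.flatnonzero(mask.any(axis=0))  # columns containing object pixels
    if rows.size == 0:
        raise ValueError("mask contains no object pixels")
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    width = right - left + 1
    height = bottom - top + 1
    cx = left + width / 2
    cy = top + height / 2
    return float(cx), float(cy), int(width), int(height)

# Hypothetical 6x8 image with an object covering rows 1..3 and columns 2..6.
mask = np.zeros((6, 8), dtype=bool)
mask[1:4, 2:7] = True
box = bounding_box(mask)
print(box)
```

The (center, width, height) form matches the four values the regressor described later is expected to output.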


Traditional approach: hand-crafted feature detection (HOG, Haar-like, ...) followed by a classifier (SVM, ...).

CNN approach: feature extraction followed by classification and bounding box prediction (regression). The model learns both the class and the location.


Typical architecture:

A CNN whose feature vector is fully connected both to a softmax layer (the classifier, which outputs class probabilities) and to a 4-node layer (the regressor, which outputs the bounding box position and dimensions):

  • input layer
  • deep CNN (e.g. AlexNet)
  • feature vector (the output of the network's last hidden layer, which summarizes the content of the image); 4096 nodes in AlexNet
  • fully connected layer that outputs class scores: connects the 4096 feature vector nodes with one node per class (e.g. 1000 nodes); this is the classification problem
  • another fully connected layer that outputs the bounding box: connects the 4096 feature vector nodes with 4 nodes (height, width and the two coordinates of the center) in the Box Coordinates layer; this treats localization as a regression problem

Fully supervised setting: for each training image we have an annotated ground truth (correct) label and box coordinates.
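
The two output heads described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the weights are random rather than trained, the feature vector is a random stand-in for the output of the CNN, and the names `W_cls`, `W_box` and `localize` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURES = 4096  # feature vector size mentioned in the text
CLASSES = 1000   # example number of classes mentioned in the text
BOX = 4          # center x, center y, width, height

# Randomly initialized parameters stand in for trained ones (assumption).
W_cls = rng.normal(scale=0.01, size=(FEATURES, CLASSES))
b_cls = np.zeros(CLASSES)
W_box = rng.normal(scale=0.01, size=(FEATURES, BOX))
b_box = np.zeros(BOX)

def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

def localize(features):
    """Apply both heads to the same shared feature vector."""
    class_probs = softmax(features @ W_cls + b_cls)  # classification head
    box = features @ W_box + b_box                   # regression head
    return class_probs, box

features = rng.normal(size=FEATURES)  # stand-in for the CNN output
class_probs, box = localize(features)
print(class_probs.shape, box.shape)
```

Both heads read the same feature vector, which is why a single network can learn the class and the location together.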

Loss Function

During training (backpropagation) in the fully supervised setting, we have two losses:
  • one for the predicted category, which measures the difference between the correct label and the predicted class scores: Softmax Loss (this is the cross-entropy loss applied to the softmax output, the standard loss function for a softmax layer [Is the softmax loss the same as the cross-entropy loss? - Quora])
  • another for the predicted box coordinates: L2 (least squares error) Loss, which measures the dissimilarity between the predicted and ground truth bounding boxes [What Are L1 and L2 Loss Functions?]
  • the total loss is a multi-task loss: a weighted sum of these two losses
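
As a rough sketch of how the two losses combine, the snippet below computes a softmax cross-entropy term and an L2 term for a single example and adds them. The function names and the `box_weight` parameter are hypothetical; the notes do not say how the two terms are weighted, so the value used here is arbitrary.

```python
import numpy as np

def softmax_cross_entropy(scores, label):
    """Cross-entropy between softmax(scores) and the correct class index."""
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])

def l2_loss(pred_box, true_box):
    """Sum of squared differences between predicted and ground truth box."""
    diff = np.asarray(pred_box, dtype=float) - np.asarray(true_box, dtype=float)
    return float(np.sum(diff ** 2))

def multitask_loss(scores, label, pred_box, true_box, box_weight=1.0):
    """Weighted sum of both losses; box_weight is an assumed hyperparameter."""
    return (softmax_cross_entropy(scores, label)
            + box_weight * l2_loss(pred_box, true_box))

# Toy example: 3 classes, boxes given as (cx, cy, w, h) in relative units.
scores = [2.0, 0.5, -1.0]
pred_box = [0.5, 0.5, 0.2, 0.3]
true_box = [0.5, 0.4, 0.2, 0.3]
loss = multitask_loss(scores, 0, pred_box, true_box, box_weight=2.0)
print(loss)
```

In practice the weight is a hyperparameter that balances the two tasks and usually has to be tuned.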

Human Pose Estimation

The same idea of predicting a fixed number of positions in the image is also applied to Human Pose Estimation:

  • input: image of a person
  • output: positions/coordinates of the joints (e.g. 14 joints: left/right foot, knee, hip, shoulder, elbow, hand; neck, head top)
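
A small sketch of what a fixed number of positions means for the 14-joint example: the network would output 28 numbers, which can be read as one (x, y) pair per joint. The joint ordering, the flat output layout and the function names are assumptions for illustration, and the L2 loss is used here by analogy with the bounding box regressor.

```python
# Joint list from the example above; the ordering is an assumption.
PARTS = ("foot", "knee", "hip", "shoulder", "elbow", "hand")
JOINT_NAMES = [f"{side} {part}" for part in PARTS for side in ("left", "right")]
JOINT_NAMES += ["neck", "head top"]

def decode_pose(outputs):
    """Map a flat list of 2 * 14 regressed values to named (x, y) pairs."""
    if len(outputs) != 2 * len(JOINT_NAMES):
        raise ValueError("expected one (x, y) pair per joint")
    return {name: (outputs[2 * i], outputs[2 * i + 1])
            for i, name in enumerate(JOINT_NAMES)}

def pose_l2_loss(pred, truth):
    """Sum of squared coordinate errors over all joints."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth))

outputs = [float(v) for v in range(28)]  # stand-in for network outputs
pose = decode_pose(outputs)
print(pose["neck"], pose_l2_loss(outputs, outputs))
```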


Reference: Stanford University School of Engineering: Fei-Fei Li, Justin Johnson, Serena Yeung: Convolutional Neural Networks for Visual Recognition: Lecture 11 | Detection and Segmentation (YouTube).

