Wednesday, 8 January 2020

Object Detection with Sliding Window


Take an input image and output:

  • predictions of bounding boxes (each box contains an object)
  • class scores for objects within bounding boxes


Turn this into a pure classification problem. A classifier outputs only class scores for the whole image it is given. So the idea is to take different crops from the input image, one by one, and feed each of them through our previously trained convolutional network, which makes a classification decision for that crop. The classifier is run at evenly spaced locations over the entire image.

In addition to the object labels, we add background as a classification category. The network can then predict background when a crop contains none of the categories we care about.

Sliding Windows.
Original image of animals taken from

So we have a rectangular "window" that slides across the input image, and the classifier outputs a prediction only for the crop visible through that window. The window can take various sizes and aspect ratios, and it can move in smaller or larger steps (strides), so for some crops the classifier will output higher scores for some classes.
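The window positions described above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `sliding_windows` and the `(x, y, width, height)` box layout are assumptions made for this example, and only windows that fit fully inside the image are generated.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


def sliding_windows(image_w: int, image_h: int,
                    window_w: int, window_h: int,
                    stride: int) -> Iterator[Box]:
    """Yield every window position that fits fully inside the image."""
    for y in range(0, image_h - window_h + 1, stride):
        for x in range(0, image_w - window_w + 1, stride):
            yield (x, y, window_w, window_h)


# An 8x6 image, a 4x4 window and a stride of 2 pixels.
boxes = list(sliding_windows(8, 6, 4, 4, 2))
for box in boxes:
    print(box)
```

Calling the function again with a different `window_w`, `window_h` or `stride` gives the other window sizes, aspect ratios and step lengths.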


Image --> [ Sliding Window cropping --> crop --> Classifier --> class scores ]

The process within the square brackets has to be repeated once for every crop we use.
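As a rough sketch, the loop in the diagram could look like the code below. The `toy_classifier` is a hypothetical stand-in for the trained convolutional network (it simply scores a crop by its mean brightness), and `crop`, `detect` and the 0.5 threshold are likewise assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values in [0, 1]
Box = Tuple[int, int, int, int]  # (x, y, width, height)
Scores = Dict[str, float]


def crop(image: Image, box: Box) -> Image:
    """Cut the part of the image visible through the window."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def toy_classifier(patch: Image) -> Scores:
    """Hypothetical stand-in for a trained CNN: brighter crops score as 'object'."""
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    return {"object": mean, "background": 1.0 - mean}


def detect(image: Image, window: int, stride: int,
           classifier: Callable[[Image], Scores],
           threshold: float = 0.5) -> List[Tuple[Box, str, float]]:
    """Run the classifier on every square crop and keep non-background hits."""
    height, width = len(image), len(image[0])
    detections = []
    for y in range(0, height - window + 1, stride):
        for x in range(0, width - window + 1, stride):
            box = (x, y, window, window)
            scores = classifier(crop(image, box))
            label = max(scores, key=scores.get)
            if label != "background" and scores[label] >= threshold:
                detections.append((box, label, scores[label]))
    return detections


# 4x4 image with a bright 2x2 square in the bottom-right corner.
image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]
detections = detect(image, window=2, stride=2, classifier=toy_classifier)
print(detections)
```

Crops for which background wins are dropped, so only the windows where some object class scores highest are kept.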


Because there could be any number of objects in the image, and they could appear at any location, at any size and at any aspect ratio, a brute-force sliding window approach ends up having to test a very large number of crops.

And when every one of those crops has to be fed through a large convolutional network, this becomes computationally intractable. So in practice people don't ever use this sort of brute-force sliding window approach for object detection with convolutional networks.
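To get a feeling for the numbers, here is a small back-of-the-envelope calculation. The image size, the window sizes and the stride are made-up values for illustration only, and `count_crops` is a hypothetical helper, not something from the lecture.

```python
from typing import Iterable, Tuple


def count_crops(image_w: int, image_h: int,
                window_sizes: Iterable[Tuple[int, int]],
                stride: int) -> int:
    """Count the crops a sliding window produces over all given window sizes."""
    total = 0
    for window_w, window_h in window_sizes:
        if window_w > image_w or window_h > image_h:
            continue  # this window does not fit into the image at all
        positions_x = (image_w - window_w) // stride + 1
        positions_y = (image_h - window_h) // stride + 1
        total += positions_x * positions_y
    return total


# Assumed setup: a 640x480 image, four window shapes, a stride of 8 pixels.
sizes = [(64, 64), (128, 128), (128, 64), (64, 128)]
total = count_crops(640, 480, sizes, stride=8)
print(total)
```

Even this modest setup comes to 13,524 crops, which means 13,524 forward passes through the network for a single image, and a real detector would need many more window sizes and a finer stride.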

There are two main approaches that try to improve on the sliding window.

One family of detectors tries to reduce the number of crops by proposing Regions of Interest (region-proposal detectors). They still perform classification sequentially on each RoI.

The other approach uses a single pass of the image through a CNN (single-shot detectors). OverFeat is an example of such a detector.


Lecture 11 | Detection and Segmentation - YouTube

