• Explore possible neural architectures to handle the problem of car model identification.
  • Reproduce the results reported in the paper linked here.



Expected Results

  • Reproduce accuracy scores reported in the CompCars paper.
  • Deliver a trained neural network for car model identification.


1) Performance Optimization

The objective of the PoC in relation to performance is to show the feasibility of using the Intel® architecture for inference and training.


Feasibility of inference was measured against the average execution time of the GPU inferences reported in the reference paper.

  • In the reference paper, the GPU obtained an average inference time of 200 milliseconds per image.
  • On Skylake, we obtained an average inference time of 118 milliseconds per image. The graph below shows the average execution time as the number of inferences increases.
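
The two figures above can be compared with a short calculation. The following is a minimal sketch: the function and constant names are illustrative and not part of the original PoC, and only the two latency values come from the text. The throughput helper assumes images are processed one at a time.

```python
def speedup(baseline_ms: float, measured_ms: float) -> float:
    """Return how many times faster the measured latency is than the baseline."""
    if baseline_ms <= 0 or measured_ms <= 0:
        raise ValueError("latencies must be positive")
    return baseline_ms / measured_ms


def throughput_per_second(latency_ms: float) -> float:
    """Images per second for sequential inference at the given per-image latency."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Values reported in the text above (milliseconds per image).
GPU_REFERENCE_MS = 200.0
SKYLAKE_MS = 118.0

if __name__ == "__main__":
    print(f"speedup: {speedup(GPU_REFERENCE_MS, SKYLAKE_MS):.2f}x")
    print(f"reference throughput: {throughput_per_second(GPU_REFERENCE_MS):.1f} images/s")
    print(f"Skylake throughput:   {throughput_per_second(SKYLAKE_MS):.1f} images/s")
```

With the reported values, the ratio 200 / 118 works out to roughly 1.69x.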


A gain of 3x was obtained with the same number of training steps (16,000), but using different databases. The chart below shows this comparison.

2) Accuracy results with the GoogLeNet cars model

We produced a sample of ~15K cropped images to evaluate the accuracy of the model.
To test the model, we used the Caffe test utility; the resulting accuracy is reported below:

top-1 accuracy = 0.6814
top-5 accuracy = 0.8565
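
For readers unfamiliar with the metric, top-k accuracy counts a prediction as correct when the true class is among the k highest-scoring classes. The sketch below illustrates the definition in plain Python. It is not the Caffe implementation used in the PoC; the function name and the toy scores are made up for illustration.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in ranked:
            hits += 1
    return hits / len(labels)


# Toy data: 3 samples, 4 classes (illustrative scores, not PoC results).
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1 has the highest score
    [0.5, 0.3, 0.1, 0.1],  # true class 1 has the second-highest score
    [0.2, 0.2, 0.5, 0.1],  # true class 3 has the lowest score
]
toy_labels = [1, 1, 3]

if __name__ == "__main__":
    for k in (1, 2, 4):
        print(f"top-{k} accuracy = {top_k_accuracy(toy_scores, toy_labels, k):.4f}")
```

By construction, top-5 accuracy can never be lower than top-1 accuracy on the same data, which matches the pattern in the reported numbers.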

After removing the bounding box, we obtained accuracy values broadly consistent with the technical report:

top-1 accuracy = 0.911
top-5 accuracy = 0.984

To verify the performance of the GoogLeNet cars model on new data, we carried out a fine-tuning experiment with the Stanford dataset using 8,144 training images.

The resulting accuracy, measured on 8,041 test images from the Stanford dataset, is reported below:

top-1 accuracy = 0.835
top-5 accuracy = 0.967

3) Using Wavelets to explore the full potential of Neural Networks