/* This function extracts HOG features from a set of
64x128 pixel training images. The cell size should be
8x8 pixels (the code needs small modifications for a
different cell size, which can easily be made once the
reader understands how the code works). A default block
size of 2x2 cells is assumed, and the window size
parameter should be 64x128 pixels (appropriate
modifications for another window size, say 64x80
pixels, are straightforward). All the training images
are expected to be stored in the same location, with
names numbered sequentially like a1.jpg, a2.jpg,
a3.jpg ... or a(1).jpg, a(2).jpg, a(3).jpg ... The
parameter descriptions below clarify the usage of the
function. The synopsis of the function is as follows:
prefix : the path of the images together with the
         prefix of the image names. For example, if
         the present working directory is
         /home/saurabh/hog/ and the images are in
         /home/saurabh/hog/images/positive/ and are
         named pos1.jpg, pos2.jpg, pos3.jpg ...,
         then the prefix parameter should be
         "images/positive/pos"; if the images are
         instead named pos(1).jpg, pos(2).jpg,
         pos(3).jpg ..., the prefix parameter should
         be "images/positive/pos("
suffix : the part of the image file name after the
         number, e.g. ".jpg" or ").jpg" for the
         examples above
cell : should be cvSize(8,8); appropriate changes
         need to be made for other cell sizes
window : should be cvSize(64,128); appropriate
         changes need to be made for other window
         sizes
number_samples : the number of training images, e.g.
         1216 if the training images are pos1.jpg,
         pos2.jpg ... pos1216.jpg
start_index : the first index in the image names,
         e.g. 1 for the case above, or 1000 if the
         images are named pos1000.jpg, pos1001.jpg
         ... pos2216.jpg
end_index : the last index in the image names,
         e.g. 1216 or 2216 for the cases above
savexml : if the extracted features should be stored,
         pass the name of an xml file to which they
         will be saved
normalization : the normalization scheme to be used
         when computing the HOG features; any of the
         OpenCV schemes can be passed, or -1 if no
         normalization is to be done */
CvMat* train_64x128(char *prefix, char *suffix, CvSize cell,
CvSize window, int number_samples, int start_index,
int end_index, char *savexml = NULL, int canny = 0,
int block = 1, int normalization = 4)
{
char filename[50] = "\0", number[8];
int prefix_length;
prefix_length = strlen(prefix);
int bins = 9;
/* A default block size of 2x2 cells is considered */
int block_width = 2, block_height = 2;
/* Calculation of the length of a feature vector for
an image (64x128 pixels)*/
int feature_vector_length;
feature_vector_length = (((window.width -
cell.width * block_width)/ cell.width) + 1) *
(((window.height - cell.height * block_height)
/ cell.height) + 1) * 36;
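/* For the default 64x128 window with 8x8 cells and
2x2-cell blocks this evaluates to
((64 - 16)/8 + 1) * ((128 - 16)/8 + 1) * 36
= 7 * 15 * 36 = 3780 features; the factor 36 is the
9 orientation bins times the 4 cells in a block */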
/* Matrix to store the feature vectors for
all(number_samples) the training samples */
CvMat* training = cvCreateMat(number_samples,
feature_vector_length, CV_32FC1);
CvMat row;
CvMat* img_feature_vector;
IplImage** integrals;
int i = 0, j = 0;
printf("Beginning to extract HoG features from
positive images\n");
strcat(filename, prefix);
/* Loop to calculate hog features for each
image one by one */
for (i = start_index; i <= end_index; i++)
{
cvtInt(number, i);
strcat(filename, number);
strcat(filename, suffix);
IplImage* img = cvLoadImage(filename);
/* Calculation of the integral histogram for
fast calculation of hog features*/
integrals = calculateIntegralHOG(img);
cvGetRow(training, &row, j);
img_feature_vector
= calculateHOG_window(integrals, cvRect(0, 0,
window.width, window.height), normalization);
cvCopy(img_feature_vector, &row);
j++;
printf("%s\n", filename);
filename[prefix_length] = '\0';
for (int k = 0; k < 9; k++)
{
cvReleaseImage(&integrals[k]);
}
/* Release the loaded image as well (as done in
train_large(...)) to avoid leaking memory */
cvReleaseImage(&img);
}
if (savexml != NULL)
{
cvSave(savexml, training);
}
return training;
}
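For readers asking how this function is called and what cvtInt does: from its use in the loop above, cvtInt simply writes the decimal representation of an integer into a character buffer. The code below is only a hedged sketch and is not part of the original post: a sprintf-based stand-in for cvtInt, followed by a hypothetical call assuming 1216 positive images named images/positive/pos1.jpg ... pos1216.jpg and a placeholder output file name.
/* Hedged sketch: a minimal stand-in for cvtInt, assuming it
only has to write the decimal representation of i into str */
void cvtInt(char *str, int i)
{
    sprintf(str, "%d", i);
}

/* Hypothetical usage of train_64x128(...): 1216 positive images
named images/positive/pos1.jpg ... images/positive/pos1216.jpg,
features saved to a placeholder file positive_features.xml */
CvMat* positive_features = train_64x128("images/positive/pos",
    ".jpg", cvSize(8, 8), cvSize(64, 128), 1216, 1, 1216,
    "positive_features.xml");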
/* This function is almost the same as
train_64x128(...), except that it can take as input
images of bigger sizes and generate multiple samples
out of a single image.
It takes 2 more parameters than train_64x128(...),
horizontal_scans and vertical_scans, which determine
how many samples are generated from each image: it
generates horizontal_scans x vertical_scans samples
per image. The meaning of the rest of the parameters
is the same.
For example, for a window size of 64x128 pixels, if
a 320x240 pixel image is given as input with
horizontal_scans = 5 and vertical_scans = 2, then it
will generate ten samples by considering windows in
the image with (x,y,width,height) equal to
(0,0,64,128), (64,0,64,128), (128,0,64,128), ...,
(0,112,64,128), (64,112,64,128), ...,
(256,112,64,128).
The function takes non-overlapping windows from the
image, except for the last row and last column,
which may overlap with the second-to-last row or
second-to-last column. So the values of
horizontal_scans and vertical_scans passed should be
such that that many scans can be performed on the
given image in a non-overlapping fashion. For
example, horizontal_scans = 5 and vertical_scans = 3
cannot be passed for a 320x240 pixel image, since
three vertical scans are not possible for an image
of height 240 pixels with a window of height 128
pixels. */
CvMat* train_large(char *prefix, char *suffix,
CvSize cell, CvSize window, int number_images,
int horizontal_scans, int vertical_scans,
int start_index, int end_index,
char *savexml = NULL, int normalization = 4)
{
char filename[50] = "\0", number[8];
int prefix_length;
prefix_length = strlen(prefix);
int bins = 9;
/* A default block size of 2x2 cells is considered */
int block_width = 2, block_height = 2;
/* Calculation of the length of a feature vector for
an image (64x128 pixels)*/
int feature_vector_length;
feature_vector_length = (((window.width -
cell.width * block_width) / cell.width) + 1) *
(((window.height - cell.height * block_height)
/ cell.height) + 1) * 36;
/* Matrix to store the feature vectors for all the
training samples (number_images * horizontal_scans
* vertical_scans rows) */
CvMat* training = cvCreateMat(number_images
* horizontal_scans * vertical_scans,
feature_vector_length, CV_32FC1);
CvMat row;
CvMat* img_feature_vector;
IplImage** integrals;
int i = 0, j = 0;
strcat(filename, prefix);
printf("Beginning to extract HoG features
from negative images\n");
/* Loop to calculate hog features for each
image one by one */
for (i = start_index; i <= end_index; i++)
{
cvtInt(number, i);
strcat(filename, number);
strcat(filename, suffix);
IplImage* img = cvLoadImage(filename);
integrals = calculateIntegralHOG(img);
for (int l = 0; l < vertical_scans - 1; l++)
{
for (int k = 0; k < horizontal_scans - 1; k++)
{
cvGetRow(training, &row, j);
img_feature_vector = calculateHOG_window(
integrals, cvRect(window.width * k,
window.height * l, window.width,
window.height), normalization);
cvCopy(img_feature_vector, &row);
j++;
}
cvGetRow(training, &row, j);
img_feature_vector = calculateHOG_window(
integrals, cvRect(img->width - window.width,
window.height * l, window.width,
window.height), normalization);
cvCopy(img_feature_vector, &row);
j++;
}
for (int k = 0; k < horizontal_scans - 1; k++)
{
cvGetRow(training, &row, j);
img_feature_vector = calculateHOG_window(
integrals, cvRect(window.width * k,
img->height - window.height, window.width,
window.height), normalization);
cvCopy(img_feature_vector, &row);
j++;
}
cvGetRow(training, &row, j);
img_feature_vector = calculateHOG_window(integrals,
cvRect(img->width - window.width, img->height -
window.height, window.width, window.height),
normalization);
cvCopy(img_feature_vector, &row);
j++;
printf("%s\n", filename);
filename[prefix_length] = '\0';
for (int k = 0; k < 9; k++)
{
cvReleaseImage(&integrals[k]);
}
cvReleaseImage(&img);
}
printf("%d negative samples created \n",
training->rows);
if (savexml != NULL)
{
cvSave(savexml, training);
printf("Negative samples saved as %s\n",
savexml);
}
return training;
}
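As an illustration (a hypothetical call, not from the original post), the negative set from the 320x240 example above could be processed as follows; the file names and image count are placeholders, and each image yields 5 x 2 = 10 windows.
/* Hypothetical usage of train_large(...): 2416 negative images of
320x240 pixels named images/negative/neg1.jpg ... neg2416.jpg,
scanned with 5 horizontal and 2 vertical 64x128 windows each */
CvMat* negative_features = train_large("images/negative/neg",
    ".jpg", cvSize(8, 8), cvSize(64, 128), 2416, 5, 2, 1, 2416,
    "negative_features.xml");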
/* This function trains a linear support vector
machine for object classification. The synopsis is
as follows :
pos_mat : pointer to CvMat containing hog feature
vectors for positive samples. This may be
NULL if the feature vectors are to be read
from an xml file
neg_mat : pointer to CvMat containing hog feature
vectors for negative samples. This may be
NULL if the feature vectors are to be read
from an xml file
savexml : The name of the xml file to which the learnt
svm model should be saved
pos_file: The name of the xml file from which feature
vectors for positive samples are to be read.
It may be NULL if feature vectors are passed
as pos_mat
neg_file: The name of the xml file from which feature
vectors for negative samples are to be read.
It may be NULL if feature vectors are passed
as neg_mat*/
void trainSVM(CvMat* pos_mat, CvMat* neg_mat, char *savexml,
char *pos_file = NULL, char *neg_file = NULL)
{
/* Read the feature vectors for positive samples */
if (pos_file != NULL)
{
printf("positive loading...\n");
pos_mat = (CvMat*) cvLoad(pos_file);
printf("positive loaded\n");
}
/* Read the feature vectors for negative samples */
if (neg_file != NULL)
{
neg_mat = (CvMat*) cvLoad(neg_file);
printf("negative loaded\n");
}
int n_positive, n_negative;
n_positive = pos_mat->rows;
n_negative = neg_mat->rows;
int feature_vector_length = pos_mat->cols;
int total_samples;
total_samples = n_positive + n_negative;
CvMat* trainData = cvCreateMat(total_samples,
feature_vector_length, CV_32FC1);
CvMat* trainClasses = cvCreateMat(total_samples,
1, CV_32FC1 );
CvMat trainData1, trainData2, trainClasses1,
trainClasses2;
printf("Number of positive Samples : %d\n",
pos_mat->rows);
/*Copy the positive feature vectors to training
data*/
cvGetRows(trainData, &trainData1, 0, n_positive);
cvCopy(pos_mat, &trainData1);
cvReleaseMat(&pos_mat);
/*Copy the negative feature vectors to training
data*/
cvGetRows(trainData, &trainData2, n_positive,
total_samples);
cvCopy(neg_mat, &trainData2);
cvReleaseMat(&neg_mat);
printf("Number of negative Samples : %d\n",
trainData2.rows);
/*Form the training classes for positive and
negative samples. Positive samples belong to class
1 and negative samples belong to class 2 */
cvGetRows(trainClasses, &trainClasses1, 0, n_positive);
cvSet(&trainClasses1, cvScalar(1));
cvGetRows(trainClasses, &trainClasses2, n_positive,
total_samples);
cvSet(&trainClasses2, cvScalar(2));
/* Train a linear support vector machine to learn from
the training data. The parameters may be varied and
experimented with to see their effects */
CvSVM svm(trainData, trainClasses, 0, 0,
CvSVMParams(CvSVM::C_SVC, CvSVM::LINEAR, 0, 0, 0, 2,
0, 0, 0, cvTermCriteria(CV_TERMCRIT_EPS,0, 0.01)));
printf("SVM Training Complete!!\n");
/*Save the learnt model*/
if (savexml != NULL) {
svm.save(savexml);
}
cvReleaseMat(&trainClasses);
cvReleaseMat(&trainData);
}
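A hedged sketch of how the pieces fit together; the xml file names are placeholders. Either the matrices returned by the two training functions are passed directly, or NULL is passed and the features are loaded from previously saved xml files, as described in the synopsis above.
/* Sketch only: train the SVM from the matrices returned by the
training functions ... */
trainSVM(positive_features, negative_features, "svm_model.xml");

/* ... or, as an alternative, load previously saved feature files */
trainSVM(NULL, NULL, "svm_model.xml",
         "positive_features.xml", "negative_features.xml");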
I hope the comments were helpful for understanding and using the code. To see how a large collection of files can be renamed into the sequential order required by this implementation, refer here. Another way to read in the images of the dataset would be to store the paths of all the files in a text file and then parse that text file (a minimal sketch of this approach is given below). I will follow up this post soon, describing how the learnt model can be used for actual detection of an object in an image.
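A minimal sketch of that text-file approach, under the assumption of a hypothetical list file image_list.txt containing one image path per line (paths without spaces); calculateIntegralHOG and calculateHOG_window are the functions described in this post, and 4 is the default normalization scheme used above.
/* Hedged sketch: extract a 64x128 HOG feature vector for every
image path listed (one per line) in a hypothetical image_list.txt */
FILE *list = fopen("image_list.txt", "r");
char path[256];
while (list != NULL && fscanf(list, "%255s", path) == 1)
{
    IplImage *img = cvLoadImage(path);
    if (img == NULL) continue;
    IplImage **integrals = calculateIntegralHOG(img);
    CvMat *feature_vector = calculateHOG_window(integrals,
        cvRect(0, 0, 64, 128), 4);
    /* ... copy feature_vector into a row of the training matrix ... */
    for (int k = 0; k < 9; k++) cvReleaseImage(&integrals[k]);
    cvReleaseImage(&img);
}
if (list != NULL) fclose(list);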
Looking forward to your next post. o(∩_∩)o...
I can't load your svm model?
Hey, do you mean you can't save the svm model file?
I get an error: the svm file shows zero support vectors.
Please help me out.
Hi, I more or less copied and pasted your code into my implementation, but I get linker errors:
1>svm_opencv.obj : error LNK2019: unresolved external symbol "public: virtual __thiscall CvSVM::~CvSVM(void)" (??1CvSVM@@UAE@XZ) referenced in function _main
1>svm_opencv.obj : error LNK2019: unresolved external symbol "public: __thiscall CvSVM::CvSVM(struct CvMat const *,struct CvMat const *,struct CvMat const *,struct CvMat const *,struct CvSVMParams)" (??0CvSVM@@QAE@PBUCvMat@@000UCvSVMParams@@@Z) referenced in function _main
1>svm_opencv.obj : error LNK2019: unresolved external symbol "public: __thiscall CvSVMParams::CvSVMParams(int,int,double,double,double,double,double,double,struct CvMat *,struct CvTermCriteria)" (??0CvSVMParams@@QAE@HHNNNNNNPAUCvMat@@UCvTermCriteria@@@Z) referenced in function _main
What is the detection rate, and how long does it take to train a classifier?
Maybe you haven't included "ml.h" and added the machine learning library "ml.lib".
Hey... I have my feature vectors stored in a text file instead of an xml. What modifications do I need to make? Is it possible to read in features from a text file, or do I have to save them as an xml?
I get the same model xml no matter what video/input images I take. Don't know why this happens.
ReplyDeleteyour code helps us a lot. I am very grateful about that.
ReplyDeleteHello,
ReplyDeleteI have realized your Projekt. How can i test, if svm works fine or not? what should be the output of the svm?
Hey,
ReplyDeletehow many Images need the classifier to detect people well?
hello,
ReplyDeleteI have tested your project with thre INRIA dataset.when SVM is tested(svm.predict),it have many false reject.how can we know if svm works fine or not? What is the detection rate? how can we evaluate the performance of your framework?
thanks.
Hello,
First of all, thanks for sharing your HOG training code, especially because OpenCV still has no HOG training implementation.
Is there any tool which converts the SVM trained by your code into a format accepted by the (meanwhile implemented) OpenCV HOG detector?
The svm would take as input the feature vector for a 64x128 detection window and output 1 if it detects a pedestrian in the window and 2 otherwise (these are the class labels assigned in trainSVM).
A pretty large number of training images (2000+ positive and 10000+ negative) are needed for good detection.
To evaluate the performance of the framework, you need a dataset of both positive and negative pedestrian images. Then you can use svm.predict in a loop over all the positive images to get the false negative rate, and similarly over the negative images to get the false positive rate.
I am sorry, right now there is no tool for that. Maybe I'll have a look at the format accepted by the OpenCV HOG detector and try to write code which could do that in the near future. I'll post it here whenever I do so. By the way, it would not be difficult to write a detector using the framework given here.
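To make that last remark concrete, here is a hedged, single-scale sliding-window sketch built on the functions from the post. The model file name svm_model.xml, the test image name, and the 8-pixel stride are assumptions; a full detector would additionally rescale the image in a loop and group overlapping detections.
/* Hedged sketch of a single-scale sliding-window detector.
Assumes an SVM trained as above (positives labelled 1) and
saved to a placeholder file svm_model.xml. */
CvSVM svm;
svm.load("svm_model.xml");
IplImage *img = cvLoadImage("test.png");
IplImage **integrals = calculateIntegralHOG(img);
int stride = 8; /* assumed step size in pixels */
for (int y = 0; y + 128 <= img->height; y += stride)
{
    for (int x = 0; x + 64 <= img->width; x += stride)
    {
        CvMat *window_feature_vector = calculateHOG_window(
            integrals, cvRect(x, y, 64, 128), 4);
        float r = svm.predict(window_feature_vector);
        if (r == 1)
            printf("Possible pedestrian at (%d, %d)\n", x, y);
        /* depending on how calculateHOG_window allocates its
        result, the returned matrix may need to be released here */
    }
}
for (int k = 0; k < 9; k++) cvReleaseImage(&integrals[k]);
cvReleaseImage(&img);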
Hi
I have an issue with the code. I seem to get all values in the xml as 0 or NaN. On probing, it seems that the bin images in calculateIntegralHOG are all zeros. Is this just me, or does anyone else face this problem? Saurabh, your help would be greatly appreciated.
Thanks
Same problem with me.
Hello,
1) calculateIntegralHOG is working for me
2) @Saurabh:
You're right. You can use the train_64x128 function directly as the detector function.
So I think a few modifications will be sufficient:
- detecting the target in the whole image (width x, height y)
- using multiple scales
- multithreading (OpenMP)
I will have a look at the "detectMultiScale" function in the OpenCV HOG detector. They are also using loop unrolling and caching for optimizing performance.
Best regards,
Christian
I have got the learned svm model in an xml file. But how can I test it on an image? Can anyone please help me?
Hey, I have a problem while training the svm.
I've created both positive and negative xml files,
but when I load them into trainSVM,
there is an error at n_positive = pos_mat->rows;
Can you help me out?
Is there any problem with the xml files I created?
What is the rough size of the xml file you created?
Thanks in advance.
You can test it with svm.predict.
But if I'd like to test it on a large database,
how can we evaluate the performance of this detector?
And how can we plot the DET curves?
Any ideas?
I have tried to run svm-predict.cpp with svm.cpp, and for the test_file I have used heart_scale, which comes with the libsvm project. But the problem remains: how can I use the model file I built with this code as a model file for svm-predict? The data format is completely different. Could you give me a hint on how you use this .xml model file with svm-predict, and finally how I can test hog features with this model on a new image? Thanks for all your effort.
@Sourabh: I have the same problem with the xml. How can it be tested with svm.predict?
ReplyDeleteThis is a code sample to use svm.predict:
CvSVM svm = CvSVM();
svm.load ("train_features.xml");
IplImage* img=cvLoadImage("example.png");
IplImage** integrals=calculateIntegralHOG(img);
CvMat* window_feature_vector=calculateHOG_window(integrals,cvRect(0,0,64,128),CV_L2);
float r=0; r=svm.predict(window_feature_vector);
std::cout<<"result:"<<r<<std::endl;
I think it will help.
Yahia Said
Lab IT 06 Monastir
Tunisia
How can you get the position of the object in the image? Is there any way you can fit a bounding box?
ReplyDeleteHi, I'm working on your code for a week.
ReplyDeleteThanks for the code. It's very useful.
And at last it works!
But I think it needs a slight modification.
In the calculateIntegralHOG function,
xsobel and ysobel are IPL_DEPTH_16S matrix(16bits Integer),
float* ptr1 = (float*) (xsobel->imageData + y * (xsobel->widthStep));
float* ptr2 = (float*) (ysobel->imageData + y * (ysobel->widthStep));
should be
short* ptr1 = (short*) (xsobel->imageData + y * (xsobel->widthStep));
short* ptr2 = (short*) (ysobel->imageData + y * (ysobel->widthStep));
, I think.
So the subsequent accesses to ptr1[x] and ptr2[x] should be written as
(float)(ptr1[x]) and (float)(ptr2[x]) respectively.
I hope this helps, thanks.
Hiroshi Kawaguchi
Thanks for the code. Does anybody know how to use this training with the OpenCV 2.0 HOG detector?
ReplyDeleteHello,
I encapsulated the code in a class CHOGFeatures and added a "detectMultiScale" function like the OpenCV HOG detector.
I already did some OpenMP optimization.
In order to increase performance, I trained not just one but several SVMs with different block sizes, using them together as a cascade, starting with the biggest blocks.
At the moment, searching a 640x480 image with a search window step size of (8,8) and a scale increase factor of 1.5 takes about 4 seconds per image on a quad core @ 2.4 GHz.
Does anybody have similar benchmark results?
So the next step would be:
1) using AdaBoost to find the most informative blocks
2) optimizing the HOG integral image calculation which is very slow
Best regards,
Christian
Hi Christian,
I am having problems adapting "detectMultiScale" to this code. Can I have a look at your code, please? If you want to mail me, my email is "mdjrn@yahoo.com".
Thanks.
Jack Reynolds
Hi Everybody,
Has anybody looked at the OpenCV HOGgetDefaultPeopleDetector function lately? Any idea of the format of the data? If anyone could please upload detection code based on this training, it would be much appreciated. Thanks to all.
Hi,
has anybody developed "detectMultiScale" code based on the code above?
Hi,
I have a problem with the xml result. I am using the INRIA person database for negative images and the MIT pedestrian database for positive images. Maybe I did something wrong, because there are only 0 and NaN values in the output xml files. Please help me.
Thanks. Yaroslav
Hi Yaroslav,
instead of:
float* ptr1 = (float*) (xsobel->imageData + y * (xsobel->widthStep));
float* ptr2 = (float*) (ysobel->imageData + y * (ysobel->widthStep));
Try this:
uchar* ptr1 = (uchar*)(xsobel->imageData + y * (xsobel->widthStep));
uchar* ptr2 = (uchar*)(ysobel->imageData + y * (ysobel->widthStep));
Hi everybody,
I'm testing this code on the INRIA person database. The problem is that my xml file is corrupted; I got a lot of NaN values. Did anyone find a solution to that problem?
And how do we use uchar* ptr? We need the value as float afterwards to compute the features.
Any help will be appreciated!
thank you.
Daemonhic
Same problem here... have you fixed that?
Hi everyone,
Problem fixed! Thank you.
IplImage* xsobel = cvCreateImage(cvGetSize(img_gray), IPL_DEPTH_16S,1);
IplImage* ysobel = cvCreateImage(cvGetSize(img_gray), IPL_DEPTH_16S,1);
cvSobel(img_gray, xsobel, 1, 0, 3);
cvSobel(img_gray, ysobel, 0, 1, 3);
and then:
short* ptr1 = (short*) (xsobel->imageData + y * (xsobel->widthStep));
short* ptr2 = (short*) (ysobel->imageData + y * (ysobel->widthStep));
Daemonhic
Hi,
Could anyone who has written multiscale detection code share it with us?
Thank you.
Is it possible for me to use this technique for vehicle detection? I would like to train on and track the "shape" of a vehicle.
Hello! Thank you very much for the code and the explanations.
I implemented the code and ran a test, but I keep getting the same result (r=2) when I use r=svm.predict(window_feature_vector), even with different positive/negative training sets.
Does anybody else have the same problem? Any ideas of what's wrong?
Hi there, the calculation of the integral image fails when I process a large number of training images. The error is in this line:
if (ptr1[x] == 0){
temp_gradient = ((atan(ptr2[x] / (ptr1[x] + 0.00001))) * (180/ PI)) + 90; }
Does anyone know why this happens?
Hi, can you please give me a link to download OpenCV 3.0 to test this code (it doesn't work on OpenCV 2.1)?
If you define xsobel and ysobel as IPL_DEPTH_16S,
ReplyDeleteyou have to convert the data type to 32F when you use atan, otherwise the result will not be correct.
The author mentioned that he can achieve 5 frames per second. May I know whether this is with multi-scale scanning across the image frame? My current exhaustive search requires about 1 minute for one image frame of size 352x288.
I always get r = 2 when using svm.predict(). Does anybody know why?
Please tell me, what is the function of cvtInt?
ReplyDeleteHi. Thanks for the code. I tried with only a few training images (40 positive, 50 negative) and it works great.
I'm trying to make it work with OpenCV's HOG descriptor. Your work keeps the trained data in a CvMat and saves it in an xml file. How do I convert the data to a vector...? I think this is the data type used by the default people detector.
Thank you.
I've been trying to do this but I'm not getting anywhere; did you get any results?
Deletevery Nice Info Your Shared here ........... keep sharing this wonderful stuff here .....................
ReplyDeleteKhurram Shahzad
Alaziz Online
Did you implement it? If yes, please tell me the details of how to implement it.