Natalia D


My machine-learning pet project. Part 3. Brushing up and labelling my dataset.

My pet project is about food recognition. More info here.


Scrapy generated a folder of images and a .csv file with rows like this:

Apple Cake,"Some apple cake description...",https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg,"[{'url': 'https://www.some-recipes-website.ru/binfiles/images/20200109/m12b509e.jpg', 'path': 'full/ae00a78059ad08506aa4767ed925bef5dccabf63.jpg', 'checksum': '55088c744a564af5ed8d4e5ea6478d20', 'status': 'downloaded'}]" 
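The last column is a Python-literal string (single quotes), not JSON, so ast.literal_eval parses it where json.loads would fail. Here is a minimal sketch that maps recipe names to the local paths Scrapy saved the images under; the file name items.csv, the absence of a header row and the exact column order are my assumptions based on the sample row above:

import ast
import csv

# Map each recipe name to the local path Scrapy saved its image under.
# 'items.csv' is a hypothetical file name; adjust it to the real output.
name_to_path = {}
with open('items.csv', newline='', encoding='utf-8') as f:
    for name, description, image_url, images_field in csv.reader(f):
        # The last column is a Python-literal list, not JSON, so
        # ast.literal_eval is the right parser here.
        downloads = ast.literal_eval(images_field)
        if downloads and downloads[0]['status'] == 'downloaded':
            name_to_path[name] = downloads[0]['path']  # e.g. 'full/ae00a7...jpg'

print(name_to_path.get('Apple Cake'))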

Now, I needed to create .csv files like this (one row per bounding box: file name, x_min, y_min, x_max, y_max, class; a row with empty fields marks an image with nothing to detect):

pic-037.jpg,80,20,500,120,risotto
pic-025.jpg,520,250,1152,953,risotto
pic-with-nothing.jpg,,,,,
pic-004.jpg,0,0,1600,1113,beans
...

To do that, I googled something like "best machine learning label tools 2022" and found Label Studio. I followed these steps from their docs:

python3 -m venv env
source env/bin/activate
python -m pip install label-studio

But I couldn't launch label-studio until I did some of these things:

pip install wheel
pip install spacy
pip install cymem
brew install postgresql

(link1, link2 might be helpful)

I didn't have any problems importing my data into Label Studio. The only setup I had to do was adding my labeling interface config:

<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image" showInLine="true">
    <Choice value="Salad" background="blue"/>
    <Choice value="Soup" background="green"/>
    <Choice value="Pastry" background="orange"/>
    <Choice value="Nothing" background="orange"/>
  </Choices>
  <RectangleLabels name="label" toName="image">
    <Label value="Salad" background="green"/>
    <Label value="Soup" background="blue"/>
    <Label value="Pastry" background="orange"/>
    <Label value="Nothing" background="black"/>
  </RectangleLabels>
</View>

After that, the labeling interface looked like this:

(screenshot: the Label Studio labeling interface)

I used the label "Nothing" for confusing images that I decided to exclude from the dataset:

(screenshot: a confusing image labeled "Nothing")

After I finished labeling a portion of the images, I chose to export the annotations in .csv format. I got a file with rows like this:

/data/upload/1/83d8ce57-7478f053119ca2a85c4932870ef3e1833eb3eeb5.jpg,14,Pastry,"[{""x"": 13.157894736842104, ""y"": 6.5625, ""width"": 41.578947368421055, ""height"": 74.375, ""rotation"": 0, ""rectanglelabels"": [""Pastry""], ""original_width"": 570, ""original_height"": 320}]",1,20,2022-06-06T07:11:56.377261Z,2022-06-10T21:02:35.758255Z,1209.599 
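The rectangle data sits as a JSON string inside one CSV column, so reading it back takes two parsing passes: csv first, then json. A minimal sketch; the file name project-1-export.csv and the column order (image, id, choice, label, ...) are assumptions based on my own export:

import csv
import json

# Read bounding boxes back out of the Label Studio .csv export.
# File name and column positions are assumptions from my sample row.
with open('project-1-export.csv', newline='', encoding='utf-8') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
        image_path, label_json = row[0], row[3]
        if not label_json:
            continue  # no rectangles drawn on this image
        for region in json.loads(label_json):
            print(image_path, region['rectanglelabels'],
                  region['x'], region['y'], region['width'], region['height'])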

I was surprised when I saw the x, y, width and height values. Then I read in the docs that "Image annotations exported in JSON format use percentages of overall image size, not pixels, to describe the size and location of the bounding boxes."

I wrote a small Python script to check the exported regions:

from PIL import Image

img = Image.open('../../images/full/7478f053119ca2a85c4932870ef3e1833eb3eeb5.jpg')

# Values taken from the exported row above (percentages of the image size)
x = 13.157894736842104
y = 6.5625
width = 41.578947368421055
height = 74.375
original_width = 570
original_height = 320

# Convert the percentages to pixel coordinates
pixel_x = x / 100.0 * original_width
pixel_y = y / 100.0 * original_height
pixel_width = width / 100.0 * original_width
pixel_height = height / 100.0 * original_height

left = pixel_x
upper = pixel_y
right = pixel_x + pixel_width
lower = pixel_y + pixel_height

box = (left, upper, right, lower)
region = img.crop(box)
region.show()

When I ran the script, it showed me the correct region cropped out of the original image:

(screenshot: the cropped region)

So now I understood how to get pixel annotations if I need them.
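Putting both pieces together, converting the whole export into the annotation .csv from the beginning of this post could look like the sketch below. The export file name, the column order and the way my uploads got prefixed ('83d8ce57-' + original file name in my export) are all assumptions, so double-check them against your own export:

import csv
import json
import os

# Sketch: Label Studio .csv export -> file,x_min,y_min,x_max,y_max,label
with open('project-1-export.csv', newline='', encoding='utf-8') as f_in, \
        open('annotations.csv', 'w', newline='', encoding='utf-8') as f_out:
    reader = csv.reader(f_in)
    writer = csv.writer(f_out)
    next(reader)  # skip the header row
    for row in reader:
        upload_path, label_json = row[0], row[3]
        # My uploads were stored as '<prefix>-<original name>', e.g.
        # '83d8ce57-7478f0...jpg' -> '7478f0...jpg'; strip the prefix off.
        filename = os.path.basename(upload_path).split('-', 1)[1]
        regions = json.loads(label_json) if label_json else []
        if not regions:
            # An image with nothing to detect gets a row with empty fields.
            writer.writerow([filename, '', '', '', '', ''])
            continue
        for r in regions:
            w, h = r['original_width'], r['original_height']
            x_min = round(r['x'] / 100.0 * w)
            y_min = round(r['y'] / 100.0 * h)
            x_max = round((r['x'] + r['width']) / 100.0 * w)
            y_max = round((r['y'] + r['height']) / 100.0 * h)
            # My target file uses lowercase labels; "Nothing" regions would
            # need an extra filter here if they should be excluded.
            writer.writerow([filename, x_min, y_min, x_max, y_max,
                             r['rectanglelabels'][0].lower()])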

The next post is going to be about trying to feed the dataset to RetinaNet. I am going to use only the 48 images scraped so far, just to see what input format it really needs.
