Commit ef88373

Embed images directly into worker instructions template.
To avoid issues with altering permissions on the S3 bucket used in the Ground Truth Object Detection Tutorial, this commit embeds images directly into the MTurk instructions template. It also fixes some typos and removes extraneous cells from the notebook.
1 parent c55d63d commit ef88373
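
The embedding approach described in the commit message can be sketched as follows. This is a minimal illustration, not the notebook's code: the helper names `png_to_data_uri` and `img_tag` are hypothetical, and the only assumption is that the instructions image is available as PNG bytes. The image is base64-encoded and placed in a `data:` URI, so the template needs no publicly readable S3 object.

```python
import base64


def png_to_data_uri(png_bytes: bytes) -> str:
    """Return a data URI that embeds the given PNG bytes (hypothetical helper)."""
    # b64encode returns bytes without line breaks; decode to get a str.
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return "data:image/png;base64," + encoded


def img_tag(png_bytes: bytes) -> str:
    """Build an <img> element whose source is the embedded image (hypothetical helper)."""
    return '<img src="{}" style="max-width:100%">'.format(png_to_data_uri(png_bytes))


# Example with the 8-byte PNG signature standing in for a real image file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
example_tag = img_tag(PNG_SIGNATURE)
```

The trade-off is template size: base64 inflates the payload by roughly a third, which is presumably why the diff below also lowers the saved figure's resolution.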

1 file changed

+27
-53
lines changed

ground_truth_labeling_jobs/ground_truth_object_detection_tutorial/object_detection_tutorial.ipynb

Lines changed: 27 additions & 53 deletions
@@ -47,7 +47,7 @@
 "\n",
 "#### Prerequisites\n",
 "To run this notebook, you can simply execute each cell in order. To understand what's happening, you'll need:\n",
-"* An S3 bucket you can write to -- please provide its name in the following cell. The bucket must be in the same region as this SageMaker Notebook instance. You can also change the `EXP_NAME` to any valid S3 prefix. All the files related to this experiment will be stored in that prefix of your bucket. <mark>IMPORTANT: Your S3 bucket must allow public ACL access. This was the default S3 behavior until 11/2018, but not anymore. To enable public ACL access, [follow these AWS instructions](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/block-public-access-bucket.html) and **unmark** all the checkboxes in Step 5.</mark>\n",
+"* An S3 bucket you can write to -- please provide its name in the following cell. The bucket must be in the same region as this SageMaker Notebook instance. You can also change the `EXP_NAME` to any valid S3 prefix. All the files related to this experiment will be stored in that prefix of your bucket. \n",
 "* Familiarity with Python and [numpy](http://www.numpy.org/).\n",
 "* Basic familiarity with [AWS S3](https://docs.aws.amazon.com/s3/index.html).\n",
 "* Basic understanding of [AWS Sagemaker](https://aws.amazon.com/sagemaker/).\n",
@@ -69,6 +69,7 @@
 "from collections import Counter\n",
 "from datetime import datetime\n",
 "import itertools\n",
+"import base64\n",
 "import glob\n",
 "import json\n",
 "import random\n",
@@ -101,21 +102,7 @@
 "region = boto3.session.Session().region_name\n",
 "s3 = boto3.client('s3')\n",
 "bucket_region = s3.head_bucket(Bucket=BUCKET)['ResponseMetadata']['HTTPHeaders']['x-amz-bucket-region']\n",
-"assert bucket_region == region, \"Your S3 bucket {} and this notebook need to be in the same region.\".format(BUCKET)\n",
-"\n",
-"# Test that the bucket allows public-read files.\n",
-"!echo \"test\" > test\n",
-"!aws s3 cp test s3://{BUCKET}/test\n",
-"try:\n",
-" s3.put_object_acl(\n",
-" ACL='public-read',\n",
-" Bucket=BUCKET,\n",
-" Key=f'test')\n",
-"except botocore.exceptions.ClientError:\n",
-" print('\\n\\n!!!!!!!!!! READ THIS !!!!!!!!\\n'\n",
-" 'Your bucket has wrong permissions. Please read the instructions for these cells carefully and adjust your bucket permissions as described.'\n",
-" ' Otherwise, we will be unable to upload an instruction template that is readable by the annotators to your bucket.')\n",
-" raise"
+"assert bucket_region == region, \"Your S3 bucket {} and this notebook need to be in the same region.\".format(BUCKET)"
 ]
 },
 {
@@ -283,7 +270,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"# Plot 6 samples in the given class.\n",
+"# Plot sample images.\n",
 "def plot_bbs(ax, bbs, img):\n",
 " '''Add bounding boxes to images.'''\n",
 " ax.imshow(img)\n",
@@ -297,31 +284,19 @@
 " rec = plt.Rectangle((xmin, ymin), xmax-xmin, ymax-ymin, fill=None, lw=4, edgecolor='blue')\n",
 " ax.add_patch(rec)\n",
 " \n",
-"plt.figure(facecolor='white', dpi=100, figsize=(3, 9))\n",
+"plt.figure(facecolor='white', dpi=100, figsize=(3, 7))\n",
 "plt.suptitle('Please draw a box\\n around each {}\\n like the examples below.\\n Thank you!'.format(CLASS_NAME), fontsize=15)\n",
-"for fid_id, (fid, bbs) in enumerate([list(fids2bbs.items())[idx] for idx in [1, 3, 4]]):\n",
+"for fid_id, (fid, bbs) in enumerate([list(fids2bbs.items())[idx] for idx in [1, 3]]):\n",
 " !aws s3 cp s3://open-images-dataset/test/{fid}.jpg .\n",
 " img = imageio.imread(fid + '.jpg')\n",
 " bbs = [[float(a) for a in annot[1:]] for annot in bbs]\n",
-" ax = plt.subplot(3, 1, fid_id+1)\n",
+" ax = plt.subplot(2, 1, fid_id+1)\n",
 " plot_bbs(ax, bbs, img)\n",
 " plt.axis('off')\n",
 " \n",
-"plt.savefig('instructions.png', dpi=200)\n",
-"!aws s3 cp instructions.png s3://{BUCKET}/{EXP_NAME}/instructions.png\n",
-"try:\n",
-" s3.put_object_acl(\n",
-" ACL='public-read',\n",
-" Bucket=BUCKET,\n",
-" Key=f'{EXP_NAME}/instructions.png')\n",
-"except botocore.exceptions.ClientError:\n",
-" print('\\n\\n!!!!!!!!!! READ THIS !!!!!!!!\\n'\n",
-" 'Could not make the instructions file public-readable in your S3 bucket. Annotators will not be able to see the instructions.'\n",
-" ' You must change your bucket access settings, as described at the beginning of this notebook (instructions for the first cell), '\n",
-" ' and then rerun this cell before continuing.')\n",
-" assert 1 == 0, 'Please change your bucket permissions'\n",
-"\n",
-"instructions_uri = 'https://s3.{}.amazonaws.com/{}/{}/instructions.png'.format(bucket_region, BUCKET, EXP_NAME)"
+"plt.savefig('instructions.png', dpi=60)\n",
+"with open('instructions.png', 'rb') as instructions:\n",
+" instructions_uri = base64.b64encode(instructions.read()).decode('utf-8').replace('\\n', '')"
 ]
 },
 {
@@ -332,7 +307,6 @@
 "source": [
 "from IPython.core.display import HTML, display\n",
 "\n",
-"TEST_TEMPLATE = True\n",
 "def make_template(test_template=False, save_fname='instructions.template'):\n",
 " template = r\"\"\"<script src=\"https://assets.crowd.aws/crowd-html-elements.js\"></script>\n",
 " <crowd-form>\n",
@@ -358,7 +332,7 @@
 "\n",
 " </full-instructions>\n",
 " <short-instructions>\n",
-" <img src=\"{instructions_uri}\" style=\"max-width:100%\">\n",
+" <img src=\"data:image/png;base64,{instructions_uri}\" style=\"max-width:100%\">\n",
 " </short-instructions>\n",
 " </crowd-bounding-box>\n",
 " </crowd-form>\n",
@@ -367,8 +341,6 @@
 " labels_str=str(CLASS_LIST) if test_template else '{{ task.input.labels | to_json | escape }}')\n",
 " with open(save_fname, 'w') as f:\n",
 " f.write(template)\n",
-" if test_template is False:\n",
-" print(template)\n",
 "\n",
 " \n",
 "make_template(test_template=True, save_fname='instructions.html')\n",
@@ -398,14 +370,14 @@
 "3. Enter the desired name for your private workteam.\n",
 "4. Select \"Create a new Amazon Cognito user group\" and click \"Create private team.\"\n",
 "5. The AWS Console should now return to `AWS Console > Amazon SageMaker > Labeling workforces`.\n",
-"5. Click on \"Invite new workers\" in the \"Workers\" tab.\n",
-"6. Enter your own email address in the \"Email addresses\" section and click \"Invite new workers.\"\n",
-"7. Click on your newly created team under the \"Private teams\" tab.\n",
-"8. Select the \"Workers\" tab and click \"Add workers to team.\"\n",
-"9. Select your email and click \"Add workers to team.\"\n",
-"10. The AWS Console should again return to `AWS Console > Amazon SageMaker > Labeling workforces`. Your newly created team should be visible under \"Private teams\". Next to it you will see an `ARN` which is a long string that looks like `arn:aws:sagemaker:region-name-123456:workteam/private-crowd/team-name`. Copy this ARN in the cell below.\n",
-"11. You should get an email from `no-reply@verificationemail.com` that contains your workforce username and password. \n",
-"12. In `AWS Console > Amazon SageMaker > Labeling workforces`, click on the URL in `Labeling portal sign-in URL`. Use the email/password combination from Step 11 to log in (you will be asked to create a new, non-default password).\n",
+"6. Click on \"Invite new workers\" in the \"Workers\" tab.\n",
+"7. Enter your own email address in the \"Email addresses\" section and click \"Invite new workers.\"\n",
+"8. Click on your newly created team under the \"Private teams\" tab.\n",
+"9. Select the \"Workers\" tab and click \"Add workers to team.\"\n",
+"10. Select your email and click \"Add workers to team.\"\n",
+"11. The AWS Console should again return to `AWS Console > Amazon SageMaker > Labeling workforces`. Your newly created team should be visible under \"Private teams\". Next to it you will see an `ARN` which is a long string that looks like `arn:aws:sagemaker:region-name-123456:workteam/private-crowd/team-name`. Copy this ARN into the cell below.\n",
+"12. You should get an email from `no-reply@verificationemail.com` that contains your workforce username and password. \n",
+"13. In `AWS Console > Amazon SageMaker > Labeling workforces > Private`, click on the URL under `Labeling portal sign-in URL`. Use the email/password combination from the previous step to log in (you will be asked to create a new, non-default password).\n",
 "\n",
 "That's it! This is your private worker's interface. When we create a verification task in [Verify your task using a private team](#Verify-your-task-using-a-private-team-[OPTIONAL]) below, your task should appear in this window. You can invite your colleagues to participate in the labeling job by clicking the \"Invite new workers\" button.\n",
 "\n",
@@ -497,7 +469,7 @@
 " \"PreHumanTaskLambdaArn\": prehuman_arn,\n",
 " \"MaxConcurrentTaskCount\": 200, # 200 images will be sent at a time to the workteam.\n",
 " \"NumberOfHumanWorkersPerDataObject\": 5, # We will obtain and consolidate 5 human annotations for each image.\n",
-" \"TaskAvailabilityLifetimeInSeconds\": 21600, # Your worteam has 6 hours to complete all pending tasks.\n",
+" \"TaskAvailabilityLifetimeInSeconds\": 21600, # Your workteam has 6 hours to complete all pending tasks.\n",
 " \"TaskDescription\": task_description,\n",
 " \"TaskKeywords\": task_keywords,\n",
 " \"TaskTimeLimitInSeconds\": 300, # Each image must be labeled within 5 minutes.\n",
@@ -1965,7 +1937,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"## Create Endpoint\n",
+"### Create Endpoint\n",
 "\n",
 "The next cell creates an endpoint that can be validated and incorporated into production applications. This takes about 10 minutes to complete."
 ]
@@ -2005,12 +1977,12 @@
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"print('Endpoint creation ended with EndpointStatus = {}'.format(status))"
+"### Perform inference\n",
+"\n",
+"The following cell transforms the image into the appropriate format for realtime prediction, submits the job, receives the prediction from the endpoint, and plots the result."
 ]
 },
 {
@@ -2044,6 +2016,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
+"### Clean up\n",
+"\n",
 "Finally, let's clean up and delete this endpoint."
 ]
 },