|
47 | 47 | "\n", |
48 | 48 | "#### Prerequisites\n", |
49 | 49 | "To run this notebook, you can simply execute each cell in order. To understand what's happening, you'll need:\n", |
50 | | - "* An S3 bucket you can write to -- please provide its name in the following cell. The bucket must be in the same region as this SageMaker Notebook instance. You can also change the `EXP_NAME` to any valid S3 prefix. All the files related to this experiment will be stored in that prefix of your bucket. <mark>IMPORTANT: Your S3 bucket must allow public ACL access. This was the default S3 behavior until 11/2018, but not anymore. To enable public ACL access, [follow these AWS instructions](https://docs.aws.amazon.com/AmazonS3/latest/user-guide/block-public-access-bucket.html) and **unmark** all the checkboxes in Step 5.</mark>\n", |
| 50 | + "* An S3 bucket you can write to -- please provide its name in the following cell. The bucket must be in the same region as this SageMaker Notebook instance. You can also change the `EXP_NAME` to any valid S3 prefix. All the files related to this experiment will be stored in that prefix of your bucket. \n", |
51 | 51 | "* Familiarity with Python and [numpy](http://www.numpy.org/).\n", |
52 | 52 | "* Basic familiarity with [AWS S3](https://docs.aws.amazon.com/s3/index.html).\n", |
53 | 53 | "* Basic understanding of [AWS Sagemaker](https://aws.amazon.com/sagemaker/).\n", |
|
69 | 69 | "from collections import Counter\n", |
70 | 70 | "from datetime import datetime\n", |
71 | 71 | "import itertools\n", |
| 72 | + "import base64\n", |
72 | 73 | "import glob\n", |
73 | 74 | "import json\n", |
74 | 75 | "import random\n", |
|
101 | 102 | "region = boto3.session.Session().region_name\n", |
102 | 103 | "s3 = boto3.client('s3')\n", |
103 | 104 | "bucket_region = s3.head_bucket(Bucket=BUCKET)['ResponseMetadata']['HTTPHeaders']['x-amz-bucket-region']\n", |
104 | | - "assert bucket_region == region, \"Your S3 bucket {} and this notebook need to be in the same region.\".format(BUCKET)\n", |
105 | | - "\n", |
106 | | - "# Test that the bucket allows public-read files.\n", |
107 | | - "!echo \"test\" > test\n", |
108 | | - "!aws s3 cp test s3://{BUCKET}/test\n", |
109 | | - "try:\n", |
110 | | - " s3.put_object_acl(\n", |
111 | | - " ACL='public-read',\n", |
112 | | - " Bucket=BUCKET,\n", |
113 | | - " Key=f'test')\n", |
114 | | - "except botocore.exceptions.ClientError:\n", |
115 | | - " print('\\n\\n!!!!!!!!!! READ THIS !!!!!!!!\\n'\n", |
116 | | - " 'Your bucket has wrong permissions. Please read the instructions for these cells carefully and adjust your bucket permissions as described.'\n", |
117 | | - " ' Otherwise, we will be unable to upload an instruction template that is readable by the annotators to your bucket.')\n", |
118 | | - " raise" |
| 105 | + "assert bucket_region == region, \"Your S3 bucket {} and this notebook need to be in the same region.\".format(BUCKET)" |
119 | 106 | ] |
120 | 107 | }, |
121 | 108 | { |
|
283 | 270 | "metadata": {}, |
284 | 271 | "outputs": [], |
285 | 272 | "source": [ |
286 | | - "# Plot 6 samples in the given class.\n", |
| 273 | + "# Plot sample images.\n", |
287 | 274 | "def plot_bbs(ax, bbs, img):\n", |
288 | 275 | " '''Add bounding boxes to images.'''\n", |
289 | 276 | " ax.imshow(img)\n", |
|
297 | 284 | " rec = plt.Rectangle((xmin, ymin), xmax-xmin, ymax-ymin, fill=None, lw=4, edgecolor='blue')\n", |
298 | 285 | " ax.add_patch(rec)\n", |
299 | 286 | " \n", |
300 | | - "plt.figure(facecolor='white', dpi=100, figsize=(3, 9))\n", |
| 287 | + "plt.figure(facecolor='white', dpi=100, figsize=(3, 7))\n", |
301 | 288 | "plt.suptitle('Please draw a box\\n around each {}\\n like the examples below.\\n Thank you!'.format(CLASS_NAME), fontsize=15)\n", |
302 | | - "for fid_id, (fid, bbs) in enumerate([list(fids2bbs.items())[idx] for idx in [1, 3, 4]]):\n", |
| 289 | + "for fid_id, (fid, bbs) in enumerate([list(fids2bbs.items())[idx] for idx in [1, 3]]):\n", |
303 | 290 | " !aws s3 cp s3://open-images-dataset/test/{fid}.jpg .\n", |
304 | 291 | " img = imageio.imread(fid + '.jpg')\n", |
305 | 292 | " bbs = [[float(a) for a in annot[1:]] for annot in bbs]\n", |
306 | | - " ax = plt.subplot(3, 1, fid_id+1)\n", |
| 293 | + " ax = plt.subplot(2, 1, fid_id+1)\n", |
307 | 294 | " plot_bbs(ax, bbs, img)\n", |
308 | 295 | " plt.axis('off')\n", |
309 | 296 | " \n", |
310 | | - "plt.savefig('instructions.png', dpi=200)\n", |
311 | | - "!aws s3 cp instructions.png s3://{BUCKET}/{EXP_NAME}/instructions.png\n", |
312 | | - "try:\n", |
313 | | - " s3.put_object_acl(\n", |
314 | | - " ACL='public-read',\n", |
315 | | - " Bucket=BUCKET,\n", |
316 | | - " Key=f'{EXP_NAME}/instructions.png')\n", |
317 | | - "except botocore.exceptions.ClientError:\n", |
318 | | - " print('\\n\\n!!!!!!!!!! READ THIS !!!!!!!!\\n'\n", |
319 | | - " 'Could not make the instructions file public-readable in your S3 bucket. Annotators will not be able to see the instructions.'\n", |
320 | | - " ' You must change your bucket access settings, as described at the beginning of this notebook (instructions for the first cell), '\n", |
321 | | - " ' and then rerun this cell before continuing.')\n", |
322 | | - " assert 1 == 0, 'Please change your bucket permissions'\n", |
323 | | - "\n", |
324 | | - "instructions_uri = 'https://s3.{}.amazonaws.com/{}/{}/instructions.png'.format(bucket_region, BUCKET, EXP_NAME)" |
| 297 | + "plt.savefig('instructions.png', dpi=60)\n", |
| 298 | + "with open('instructions.png', 'rb') as instructions:\n", |
| 299 | + " instructions_uri = base64.b64encode(instructions.read()).decode('utf-8').replace('\\n', '')" |
325 | 300 | ] |
326 | 301 | }, |
327 | 302 | { |
|
332 | 307 | "source": [ |
333 | 308 | "from IPython.core.display import HTML, display\n", |
334 | 309 | "\n", |
335 | | - "TEST_TEMPLATE = True\n", |
336 | 310 | "def make_template(test_template=False, save_fname='instructions.template'):\n", |
337 | 311 | " template = r\"\"\"<script src=\"https://assets.crowd.aws/crowd-html-elements.js\"></script>\n", |
338 | 312 | " <crowd-form>\n", |
|
358 | 332 | "\n", |
359 | 333 | " </full-instructions>\n", |
360 | 334 | " <short-instructions>\n", |
361 | | - " <img src=\"{instructions_uri}\" style=\"max-width:100%\">\n", |
| 335 | + " <img src=\"data:image/png;base64,{instructions_uri}\" style=\"max-width:100%\">\n", |
362 | 336 | " </short-instructions>\n", |
363 | 337 | " </crowd-bounding-box>\n", |
364 | 338 | " </crowd-form>\n", |
|
367 | 341 | " labels_str=str(CLASS_LIST) if test_template else '{{ task.input.labels | to_json | escape }}')\n", |
368 | 342 | " with open(save_fname, 'w') as f:\n", |
369 | 343 | " f.write(template)\n", |
370 | | - " if test_template is False:\n", |
371 | | - " print(template)\n", |
372 | 344 | "\n", |
373 | 345 | " \n", |
374 | 346 | "make_template(test_template=True, save_fname='instructions.html')\n", |
|
398 | 370 | "3. Enter the desired name for your private workteam.\n", |
399 | 371 | "4. Select \"Create a new Amazon Cognito user group\" and click \"Create private team.\"\n", |
400 | 372 | "5. The AWS Console should now return to `AWS Console > Amazon SageMaker > Labeling workforces`.\n", |
401 | | - "5. Click on \"Invite new workers\" in the \"Workers\" tab.\n", |
402 | | - "6. Enter your own email address in the \"Email addresses\" section and click \"Invite new workers.\"\n", |
403 | | - "7. Click on your newly created team under the \"Private teams\" tab.\n", |
404 | | - "8. Select the \"Workers\" tab and click \"Add workers to team.\"\n", |
405 | | - "9. Select your email and click \"Add workers to team.\"\n", |
406 | | - "10. The AWS Console should again return to `AWS Console > Amazon SageMaker > Labeling workforces`. Your newly created team should be visible under \"Private teams\". Next to it you will see an `ARN` which is a long string that looks like `arn:aws:sagemaker:region-name-123456:workteam/private-crowd/team-name`. Copy this ARN in the cell below.\n", |
407 | | - "11. You should get an email from `no-reply@verificationemail.com` that contains your workforce username and password. \n", |
408 | | - "12. In `AWS Console > Amazon SageMaker > Labeling workforces`, click on the URL in `Labeling portal sign-in URL`. Use the email/password combination from Step 11 to log in (you will be asked to create a new, non-default password).\n", |
| 373 | + "6. Click on \"Invite new workers\" in the \"Workers\" tab.\n", |
| 374 | + "7. Enter your own email address in the \"Email addresses\" section and click \"Invite new workers.\"\n", |
| 375 | + "8. Click on your newly created team under the \"Private teams\" tab.\n", |
| 376 | + "9. Select the \"Workers\" tab and click \"Add workers to team.\"\n", |
| 377 | + "10. Select your email and click \"Add workers to team.\"\n", |
| 378 | + "11. The AWS Console should again return to `AWS Console > Amazon SageMaker > Labeling workforces`. Your newly created team should be visible under \"Private teams\". Next to it you will see an `ARN` which is a long string that looks like `arn:aws:sagemaker:region-name-123456:workteam/private-crowd/team-name`. Copy this ARN into the cell below.\n", |
| 379 | + "12. You should get an email from `no-reply@verificationemail.com` that contains your workforce username and password. \n", |
| 380 | + "13. In `AWS Console > Amazon SageMaker > Labeling workforces > Private`, click on the URL under `Labeling portal sign-in URL`. Use the email/password combination from the previous step to log in (you will be asked to create a new, non-default password).\n", |
409 | 381 | "\n", |
410 | 382 | "That's it! This is your private worker's interface. When we create a verification task in [Verify your task using a private team](#Verify-your-task-using-a-private-team-[OPTIONAL]) below, your task should appear in this window. You can invite your colleagues to participate in the labeling job by clicking the \"Invite new workers\" button.\n", |
411 | 383 | "\n", |
|
497 | 469 | " \"PreHumanTaskLambdaArn\": prehuman_arn,\n", |
498 | 470 | " \"MaxConcurrentTaskCount\": 200, # 200 images will be sent at a time to the workteam.\n", |
499 | 471 | " \"NumberOfHumanWorkersPerDataObject\": 5, # We will obtain and consolidate 5 human annotations for each image.\n", |
500 | | - " \"TaskAvailabilityLifetimeInSeconds\": 21600, # Your worteam has 6 hours to complete all pending tasks.\n", |
| 472 | + " \"TaskAvailabilityLifetimeInSeconds\": 21600, # Your workteam has 6 hours to complete all pending tasks.\n", |
501 | 473 | " \"TaskDescription\": task_description,\n", |
502 | 474 | " \"TaskKeywords\": task_keywords,\n", |
503 | 475 | " \"TaskTimeLimitInSeconds\": 300, # Each image must be labeled within 5 minutes.\n", |
|
1965 | 1937 | "cell_type": "markdown", |
1966 | 1938 | "metadata": {}, |
1967 | 1939 | "source": [ |
1968 | | - "## Create Endpoint\n", |
| 1940 | + "### Create Endpoint\n", |
1969 | 1941 | "\n", |
1970 | 1942 | "The next cell creates an endpoint that can be validated and incorporated into production applications. This takes about 10 minutes to complete." |
1971 | 1943 | ] |
|
2005 | 1977 | ] |
2006 | 1978 | }, |
2007 | 1979 | { |
2008 | | - "cell_type": "code", |
2009 | | - "execution_count": null, |
| 1980 | + "cell_type": "markdown", |
2010 | 1981 | "metadata": {}, |
2011 | | - "outputs": [], |
2012 | 1982 | "source": [ |
2013 | | - "print('Endpoint creation ended with EndpointStatus = {}'.format(status))" |
| 1983 | + "### Perform inference\n", |
| 1984 | + "\n", |
| 1985 | + "The following cell transforms the image into the appropriate format for realtime prediction, submits the job, receives the prediction from the endpoint, and plots the result." |
2014 | 1986 | ] |
2015 | 1987 | }, |
2016 | 1988 | { |
|
2044 | 2016 | "cell_type": "markdown", |
2045 | 2017 | "metadata": {}, |
2046 | 2018 | "source": [ |
| 2019 | + "### Clean up\n", |
| 2020 | + "\n", |
2047 | 2021 | "Finally, let's clean up and delete this endpoint." |
2048 | 2022 | ] |
2049 | 2023 | }, |
|
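Note on the inline-image change: the diff replaces the public-read S3 upload of `instructions.png` with a base64 payload that the template embeds through a `data:image/png;base64,` URI. A minimal sketch of that pattern follows, using only the standard library. The helper names `to_base64_payload` and `to_img_tag` and the stand-in bytes are hypothetical and do not appear in the notebook, which inlines this logic. The sketch also suggests that the `.replace('\n', '')` call in the new cell is most likely a harmless no-op, because `base64.b64encode` does not insert line breaks.

```python
import base64


def to_base64_payload(raw: bytes) -> str:
    """Hypothetical helper: base64 text for raw image bytes.

    Mirrors what the new cell stores in `instructions_uri`.
    """
    return base64.b64encode(raw).decode('utf-8')


def to_img_tag(payload: str) -> str:
    """Hypothetical helper: build the <img> tag used in the short instructions."""
    return '<img src="data:image/png;base64,{}" style="max-width:100%">'.format(payload)


# Stand-in bytes (PNG signature plus filler), not a real image.
fake_png = b'\x89PNG\r\n\x1a\n' + bytes(range(256))

payload = to_base64_payload(fake_png)
tag = to_img_tag(payload)

# The payload decodes back to the original bytes and contains no newlines.
round_trip = base64.b64decode(payload)
```

Because the image travels inside the template itself, annotators need no read access to the bucket. The trade-off is template size, which may be why the diff also lowers the saved figure from `dpi=200` to `dpi=60`.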