URL: http://github.com/aws/sagemaker-python-sdk/pull/5704

Fix #5627: SFT example notebook references inaccessible S3 dataset URI by JiwaniZakir · Pull Request #5704 · aws/sagemaker-python-sdk · GitHub

Fix #5627: SFT example notebook references inaccessible S3 dataset URI (#5704)

Open
JiwaniZakir wants to merge 1 commit into aws:master from JiwaniZakir:fix/5627-sft-example-notebook-references-inaccess
Conversation

@JiwaniZakir
Closes #5627

Motivation

The SFT finetuning example notebook hardcoded an internal S3 URI (s3://mc-flows-sdk-testing/...) that external users cannot access, causing an immediate 403 Forbidden error when running the dataset registration cell.

Changes

File: v3-examples/model-customization-examples/sft_finetuning_example_notebook_pysdk_prod_v3.ipynb

  • Dataset registration cell (~line 85): Replaced the hardcoded s3://mc-flows-sdk-testing/input_data/sft/sample_data_256_final.jsonl source with a named placeholder variable MY_DATASET_S3_URI = "s3://<your-bucket>/<path-to-your-dataset>.jsonl" marked with a # TODO comment. Added an explanatory comment block describing the required JSONL format (prompt/completion fields per line) and linking to the SageMaker SFT documentation.
  • Training job cell (~line 169): Replaced s3_output_path="s3://mc-flows-sdk-testing/output/" with "s3://<your-bucket>/output/" and a # TODO comment.
  • Second training job cell (~line 384): Same s3_output_path substitution as above.
  • Nova training job cell (~line 445): Replaced s3_output_path="s3://mc-flows-sdk-testing-us-east-1/output/" with the same placeholder pattern.
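The placeholder pattern and the JSONL format described above can be sketched as follows. This is a minimal illustration, not the notebook's actual cell contents: `is_placeholder` and `validate_sft_record` are hypothetical helpers introduced here, and the sample record is invented for demonstration. Only the `MY_DATASET_S3_URI` name, its placeholder value, and the prompt/completion field names come from the change description.

```python
import json

# TODO: replace with an S3 URI that your own account can read.
MY_DATASET_S3_URI = "s3://<your-bucket>/<path-to-your-dataset>.jsonl"


def is_placeholder(uri: str) -> bool:
    """Hypothetical guard: True while the URI still contains an unfilled <...> marker."""
    return "<" in uri or ">" in uri


def validate_sft_record(line: str) -> bool:
    """Hypothetical check: one JSONL line must be an object with string prompt/completion fields."""
    record = json.loads(line)
    if not isinstance(record, dict):
        return False
    return all(isinstance(record.get(key), str) for key in ("prompt", "completion"))


# Invented sample line in the prompt/completion format the notebook comment describes.
sample_line = json.dumps({"prompt": "Translate 'hello' to French.", "completion": "bonjour"})

if is_placeholder(MY_DATASET_S3_URI):
    print("Set MY_DATASET_S3_URI to your own dataset before running the registration cell.")
```

A guard like this would surface the unfilled placeholder locally instead of letting the request reach S3, though the PR itself relies on the TODO comments alone.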

Testing

Manual verification: open the notebook and confirm no cells reference mc-flows-sdk-testing — all four occurrences are replaced with <your-bucket> placeholders. A user following the notebook will now see the TODO markers before executing any cell that requires S3 access, preventing the 403 error. Substituting a valid bucket and a JSONL file with prompt/completion fields allows the notebook to run end-to-end successfully.
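The manual check could also be scripted. The sketch below assumes the standard `.ipynb` layout (a JSON object with a `cells` list whose `source` is either a string or a list of strings); the function names are hypothetical and not part of the PR.

```python
import json


def cells_referencing(notebook: dict, needle: str = "mc-flows-sdk-testing") -> list:
    """Return the indices of cells whose source text contains `needle`."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        source = cell.get("source", "")
        # Notebook sources are stored either as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        if needle in text:
            hits.append(index)
    return hits


def check_notebook_file(path: str, needle: str = "mc-flows-sdk-testing") -> list:
    """Load a notebook from disk and report which cells still reference `needle`."""
    with open(path, encoding="utf-8") as handle:
        return cells_referencing(json.load(handle), needle)
```

Calling `check_notebook_file` on the notebook path listed under Changes should return an empty list once all four occurrences are replaced.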

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
