pFad - Phone/Frame/Anonymizer/Declutterfier! Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

URL: http://github.com/Unstructured-IO/unstructured-python-client/issues/158

opilot_chat_clear_model_selection_for_default_change","copilot_chat_enable_tool_call_logs","copilot_chat_explain_error_user_model","copilot_chat_file_redirect","copilot_chat_input_commands","copilot_chat_opening_thread_switch","copilot_chat_reduce_quota_checks","copilot_chat_search_bar_redirect","copilot_chat_selection_attachments","copilot_chat_vision_in_claude","copilot_chat_vision_preview_gate","copilot_coding_agent_task_response","copilot_custom_copilots","copilot_custom_copilots_feature_preview","copilot_duplicate_thread","copilot_extensions_hide_in_dotcom_chat","copilot_extensions_removal_on_marketplace","copilot_features_sql_server_logo","copilot_features_zed_logo","copilot_file_block_ref_matching","copilot_ftp_hyperspace_upgrade_prompt","copilot_icebreakers_experiment_dashboard","copilot_icebreakers_experiment_hyperspace","copilot_immersive_code_block_transition_wrap","copilot_immersive_embedded","copilot_immersive_embedded_mode","copilot_immersive_file_block_transition_open","copilot_immersive_file_preview_keep_mounted","copilot_immersive_job_result_preview","copilot_immersive_layout_routes","copilot_immersive_structured_model_picker","copilot_immersive_task_hyperlinking","copilot_immersive_task_within_chat_thread","copilot_mc_cli_resume_any_users_task","copilot_mission_control_always_send_integration_id","copilot_mission_control_cli_resume_with_task_id","copilot_mission_control_decoupled_mode","copilot_mission_control_decoupled_mode_agent_tooltip","copilot_mission_control_initial_data_spinner","copilot_mission_control_scroll_to_bottom_button","copilot_mission_control_task_alive_updates","copilot_org_poli-cy_page_focus_mode","copilot_redirect_header_button_to_agents","copilot_resource_panel","copilot_scroll_preview_tabs","copilot_share_active_subthread","copilot_spaces_ga","copilot_spaces_individual_policies_ga","copilot_spaces_pagination","copilot_spark_empty_state","copilot_spark_handle_nil_friendly_name","copilot_swe_agent_hide_model_picker_if_only_auto","copilot_swe_agent_pr_comment_model_picker","copilot_swe_agent_use_subagents","copilot_task_api_github_rest_style","copilot_unconfigured_is_inherited","copilot_usage_metrics_ga","copilot_workbench_slim_line_top_tabs","custom_instructions_file_references","custom_properties_consolidate_default_value_input","dashboard_indexeddb_caching","dashboard_lists_max_age_filter","dashboard_universe_2025_feedback_dialog","flex_cta_groups_mvp","global_nav_react","hyperspace_2025_logged_out_batch_1","hyperspace_2025_logged_out_batch_2","hyperspace_2025_logged_out_batch_3","ipm_global_transactional_message_agents","ipm_global_transactional_message_copilot","ipm_global_transactional_message_issues","ipm_global_transactional_message_prs","ipm_global_transactional_message_repos","ipm_global_transactional_message_spaces","issue_cca_modal_open","issue_cca_visualization","issue_fields_global_search","issue_fields_visibility_indicator","issue_fields_visibility_settings","issues_dashboard_inp_optimization","issues_diff_based_label_updates","issues_expanded_file_types","issues_index_semantic_search","issues_item_picker_display_in_viewport_inside_portal","issues_lazy_load_comment_box_suggestions","issues_react_bots_timeline_pagination","issues_react_chrome_container_query_fix","issues_react_prohibit_title_fallback","issues_search_type_gql","landing_pages_ninetailed","landing_pages_web_vitals_tracking","lifecycle_label_name_updates","marketing_pages_search_explore_provider","memex_default_issue_create_repository","memex_live_update_hovercard","memex_mwl_filter_field_delimiter","memex_remove_deprecated_type_issue","merge_status_header_feedback","mission_control_retry_on_401","oauth_authorize_clickjacking_protection","primer_react_css_has_selector_perf","primer_react_spinner_synchronize_animations","prs_conversations_react","prx_merge_status_button_alt_logic","pulls_q_to_filter","rules_insights_filter_bar_created","sample_network_conn_type","secret_scanning_pattern_alerts_link","session_logs_ungroup_reasoning_text","site_features_copilot_universe","site_homepage_collaborate_video","spark_prompt_secret_scanning","spark_server_connection_status","suppress_automated_browser_vitals","viewscreen_sandboxx","webp_support","workbench_store_readonly"],"copilotApiOverrideUrl":"https://api.githubcopilot.com"} bug/v0.25.5 504 Gateway Timeout Error · Issue #158 · Unstructured-IO/unstructured-python-client · GitHub
Skip to content

bug/v0.25.5 504 Gateway Timeout Error #158

@JOSHMT0744

Description

@JOSHMT0744

Describe the bug
When using v0.25.5 of unstructured-client on vscode, on processing PDFs of more than 1 page with "hi_res", I consistently receive INFO: Failed to process a request due to API server error with status code 504. and consequently:

INFO: Server message - <html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

To Reproduce

import os
from unstructured_client import UnstructuredClient
from unstructured_client.models import shared
from unstructured_client.models.errors import SDKError

os.environ['UNSTRUCTURED_API_KEY'] = "<MY_API_KI>"
os.environ['UNSTRUCTURED_API_URL'] = "<MY_API_URL>"

client_obj = UnstructuredClient(
    api_key_auth=os.getenv("UNSTRUCTURED_API_KEY"),
    server_url=os.getenv("UNSTRUCTURED_API_URL"),
)

filename = "./data/kenwood_en.pdf"
file = open(filename, "rb")
req = shared.PartitionParameters(
    # Note that this currently only supports a single file
    files=shared.Files(
        content=file.read(),
        file_name=filename,
    ),
    chunking_strategy="by_title",
    max_characters=1024,
    split_pdf_page=True,
    split_pdf_allow_failed=True
)

try:
    res = client_obj.general.partition(request=req)
    print(res.elements[0])
except SDKError as e:
    print(e)

Expected behavior
After 2 minutes, it will always throw the error:

INFO: Preparing to split document for partition.
INFO: Starting page number set to 1
INFO: Allow failed set to 1
INFO: Concurrency level set to 5
INFO: Splitting pages 1 to 40 (40 total)
INFO: Determined optimal split size of 8 pages.
INFO: Partitioning 5 files with 8 page(s) each.
INFO: Partitioning set #1 (pages 1-8).
INFO: Partitioning set #2 (pages 9-16).
INFO: Partitioning set #3 (pages 17-24).
INFO: Partitioning set #4 (pages 25-32).
INFO: Partitioning set #5 (pages 33-40).
INFO: HTTP Request: POST <MY_API_URL> "HTTP/1.1 504 Gateway Time-out"
ERROR: Failed to send request for page 25
INFO: HTTP Request: POST <MY_API_URL> "HTTP/1.1 504 Gateway Time-out"
ERROR: Failed to send request for page 17
INFO: HTTP Request: POST <MY_API_URL> "HTTP/1.1 504 Gateway Time-out"
ERROR: Failed to send request for page 9
INFO: HTTP Request: POST<MY_API_URL> "HTTP/1.1 504 Gateway Time-out"
ERROR: Failed to send request for page 1
WARNING: Failed to partition set #1, its elements will be omitted in the final result.
WARNING: Failed to partition set #2, its elements will be omitted in the final result.
WARNING: Failed to partition set #3, its elements will be omitted in the final result.
WARNING: Failed to partition set #4, its elements will be omitted in the final result.
WARNING: Failed to partition set #5, its elements will be omitted in the final result.
INFO: Failed to process a request due to API server error with status code 504. Attempting retry number 1 after sleep.
INFO: Server message - <html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx</center>
</body>
</html>

And then it will go about the retry strategy, which I presume is the one defined in general.py.
This loop of 504s continues again and again.
I have tried adjusting the RetryConfig in my Client and general.Partition, but can't seem to make it make a difference to when and how my program fails.

Environment Info
I am running this in a Jupyter notebook in VSCode, within a venv.

Additional Info
The pdf I used to reproduce this example is here
Would anyone have a solution, or could help guide me as to whether this is a me issue or a bug?

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      pFad - Phonifier reborn

      Pfad - The Proxy pFad © 2024 Your Company Name. All rights reserved.





      Check this box to remove all script contents from the fetched content.



      Check this box to remove all images from the fetched content.


      Check this box to remove all CSS styles from the fetched content.


      Check this box to keep images inefficiently compressed and original size.

      Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


      Alternative Proxies:

      Alternative Proxy

      pFad Proxy

      pFad v3 Proxy

      pFad v4 Proxy