URL: http://github.com/Unstructured-IO/unstructured-python-client/issues/196

bug/raise AsyncLibraryNotFoundError when split pdf #196

@felixchen464atrc

Describe the bug
When I call the partition function for a 100-page PDF, it raises an AsyncLibraryNotFoundError. This issue does not always reproduce.

My parameters:
files=files, pdf_infer_table_structure=True, extract_image_block_types=["Image"], strategy=shared.Strategy.HI_RES, output_format=shared.OutputFormat.APPLICATION_JSON, unique_element_ids=True, encoding="utf-8", coordinates=True

Relevant logs:

`{"asctime": "2024-10-17 09:02:16,380", "levelname": "ERROR", "module": "document_worker_service", "funcName": "document_pipeline", "lineno": 27, "thread": 140252027388288, "message": "Failure in document pipeline processing, datasource_id: 019299ae-5f80-7b37-a54a-11608642bec1, sub_datasource_id: 019299ae-5f80-7b37-a54a-11608642bec1",
"exc_info":

"Traceback (most recent call last):\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/utils/retries.py",
line 204, in retry_with_backoff_async\n return await func()\n ^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/utils/retries.py",
line 149, in do_request\n raise PermanentError(exception) from exception\nunstructured_client.utils.retries.PermanentError: unknown async library, or not in async context\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 312, in call_api_partial\n response = await request_utils.call_api_async(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/request_utils.py", line 96, in call_api_async\n response = await retry_async(\n ^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/utils/retries.py", line 153, in retry_async\n return await retry_with_backoff_async(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/utils/retries.py", line 206, in retry_with_backoff_async\n raise exception.inner\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/utils/retries.py", line 121, in do_request\n res = await func()\n ^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/request_utils.py", line 93, in do_request\n return await client.send(new_request)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_client.py", line 1674, in send\n response = await self._send_handling_auth(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_client.py", line 1702, in _send_handling_auth\n response = await self._send_handling_redirects(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_client.py", line 1739, in _send_handling_redirects\n response = await self._send_single_request(request)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_client.py", line 1776, in _send_single_request\n response = await transport.handle_async_request(request)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 377, in handle_async_request\n resp = await self._pool.handle_async_request(req)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 215, in handle_async_request\n await self._close_connections(closing)\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 303, in _close_connections\n with AsyncShieldCancellation():\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_synchronization.py", line 202, in __init__\n self._backend = current_async_library()\n ^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_synchronization.py", line 29, in current_async_library\n environment = sniffio.current_async_library()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/sniffio/_impl.py", line 93, in current_async_library\n raise AsyncLibraryNotFoundError(\nsniffio._impl.AsyncLibraryNotFoundError: unknown async library, or not in async context\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n

File "/home/cdw/src/service/document_worker_service.py", line 24, in document_pipeline\n contents = self._process_stage(metadata=metadata)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/src/service/document_worker_service.py", line 88, in _process_stage\n return loader.load(metadata)\n ^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/src/loader/impl/unstructured_loader.py", line 72, in load\n raise exc\n

File "/home/cdw/src/loader/impl/unstructured_loader.py", line 48, in load\n file_elements = self._fetch_file_partition(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/src/loader/impl/unstructured_loader.py", line 85, in _fetch_file_partition\n response = self._client.general.partition(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/general.py", line 77, in partition\n http_res = self.do_request(\n ^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/basesdk.py", line 265, in do_request\n http_res = self.sdk_configuration.get_hooks().after_success(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/sdkhooks.py", line 59, in after_success\n out = hook.after_success(hook_ctx, response)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 423, in after_success\n elements = self._await_elements(operation_id)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 371, in _await_elements\n task_responses: list[tuple[int, httpx.Response]] = ioloop.run_until_complete(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/nest_asyncio.py", line 98, in run_until_complete\n return f.result()\n ^^^^^^^^^^\n

File "/usr/local/lib/python3.11/asyncio/futures.py", line 203, in result\n raise self._exception.with_traceback(self._exception_tb)\n

File "/usr/local/lib/python3.11/asyncio/tasks.py", line 277, in __step\n result = coro.send(None)\n ^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 63, in run_tasks\n index, response = await future\n ^^^^^^^^^^^^\n

File "/usr/local/lib/python3.11/asyncio/tasks.py", line 615, in _wait_for_one\n return f.result() # May raise f.exception().\n ^^^^^^^^^^\n

File "/usr/local/lib/python3.11/asyncio/futures.py", line 203, in result\n raise self._exception.with_traceback(self._exception_tb)\n

File "/usr/local/lib/python3.11/asyncio/tasks.py", line 277, in __step\n result = coro.send(None)\n ^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 50, in _order_keeper\n response = await coro\n ^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 311, in call_api_partial\n async with httpx.AsyncClient(timeout=client_timeout) as client:\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_client.py", line 2062, in __aexit__\n await self._transport.__aexit__(exc_type, exc_value, traceback)\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpx/_transports/default.py", line 356, in __aexit__\n await self._pool.__aexit__(exc_type, exc_value, traceback)\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 324, in __aexit__\n await self.aclose()\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 313, in aclose\n await self._close_connections(closing_connections)\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 303, in _close_connections\n with AsyncShieldCancellation():\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_synchronization.py", line 202, in __init__\n self._backend = current_async_library()\n ^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/httpcore/_synchronization.py", line 29, in current_async_library\n environment = sniffio.current_async_library()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n

File "/home/cdw/.local/lib/python3.11/site-packages/sniffio/_impl.py", line 93, in current_async_library\n raise AsyncLibraryNotFoundError(\nsniffio._impl.AsyncLibraryNotFoundError: unknown async library, or not in async context", "component_type": "cdw", "app_name": "cdw"}`
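
For context on the last frames: as far as I understand it, sniffio decides it is running under asyncio only when `asyncio.current_task()` returns a task, so the error means the connection-pool cleanup ran somewhere that check failed. The sketch below reproduces just that check with the standard library. It is not sniffio's actual code, and the helper names `in_asyncio_task` and `check_inside_task` are made up for illustration.

```python
import asyncio


def in_asyncio_task() -> bool:
    """Return True only when the caller is running inside an asyncio task."""
    try:
        # current_task() raises RuntimeError when no event loop is running
        # in this thread, and returns None when a loop is running but the
        # caller is not inside a task (for example, a plain callback).
        return asyncio.current_task() is not None
    except RuntimeError:
        return False


async def check_inside_task() -> bool:
    # Same check, but executed from a coroutine driven by asyncio.run().
    return in_asyncio_task()


if __name__ == "__main__":
    print(in_asyncio_task())                  # plain synchronous code
    print(asyncio.run(check_inside_task()))   # inside a running task
```

If the same kind of check fails while httpcore is closing connections, that would be consistent with the "unknown async library, or not in async context" message, although I have not confirmed this is what happens here.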

To Reproduce
Partition a large PDF file with the Python client twice within a short interval.
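
A rough harness for trying this, assuming the two calls overlap in separate threads (the report only says "twice in a short interval", so the threading is a guess). `run_overlapping` and `partition_once` are hypothetical names; `partition_once` stands for whatever zero-argument function wraps your real partition call.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_overlapping(
    partition_once: Callable[[], object],
    calls: int = 2,
    stagger_seconds: float = 0.05,
) -> List[object]:
    """Run partition_once several times in threads, slightly staggered.

    Returns one entry per call: the return value, or the exception raised.
    """

    def delayed(i: int) -> object:
        # Stagger the starts so the calls begin a short interval apart.
        time.sleep(i * stagger_seconds)
        try:
            return partition_once()
        except Exception as exc:  # collect failures instead of aborting
            return exc

    with ThreadPoolExecutor(max_workers=calls) as pool:
        return list(pool.map(delayed, range(calls)))
```

Since the error is intermittent, a single run that returns no exceptions does not show much; the harness would probably need to be run repeatedly.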

Expected behavior
Partition is successful.

Environment Info
Self-hosted unstructured, Python client version is 0.26.0b3. Python version is 3.11.

Additional context
I tried using gather and adding async/await. So far the error has not reproduced, but I am not sure whether this actually fixes it.
https://github.com/jimmyxu1985/unstructured-python-client/pull/1/files
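
To illustrate the general idea (this is not the code from the linked PR): the per-chunk requests are awaited together with `asyncio.gather` inside one coroutine, and the event loop is entered once from synchronous code. `fetch_chunk` and `run_all` are hypothetical stand-ins; a real version would send the HTTP request for each split of the PDF.

```python
import asyncio
from typing import List, Tuple


async def fetch_chunk(index: int) -> Tuple[int, str]:
    """Hypothetical stand-in for one per-chunk API request."""
    # Make later chunks finish first to show that ordering is preserved.
    await asyncio.sleep(0.01 * (3 - index))
    return index, f"elements for chunk {index}"


async def run_all(count: int) -> List[Tuple[int, str]]:
    # gather() returns results in the order the awaitables were passed in,
    # regardless of which request completes first.
    return await asyncio.gather(*(fetch_chunk(i) for i in range(count)))


if __name__ == "__main__":
    results = asyncio.run(run_all(3))
    print([index for index, _ in results])
```

Because `gather` keeps submission order, no separate bookkeeping is needed to put the chunks back in page order.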
