1 file changed: +9 -5 lines changed
haystack/components/fetchers
@@ -52,12 +52,16 @@ def _binary_content_handler(response: Response) -> ByteStream:
 @component
 class LinkContentFetcher:
     """
-    LinkContentFetcher is a component for fetching and extracting content from URLs.
+    Fetches and extracts content from URLs.
 
-    It supports handling various content types, retries on failures, and automatic user-agent rotation for failed web
-    requests.
+    It supports various content types, retries on failures, and automatic user-agent rotation for failed web
+    requests. Use it as the data-fetching step in your pipelines.
+
+    You may need to convert LinkContentFetcher's output into a list of documents. Use HTMLToDocument
+    converter to do this.
+
+    ### Usage example
 
-    Usage example:
     ```python
     from haystack.components.fetchers.link_content import LinkContentFetcher
 
@@ -84,7 +88,7 @@ def __init__(
             For multiple URLs, it logs errors and returns the content it successfully fetched.
         :param user_agents: [User agents](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent)
             for fetching content. If `None`, a default user agent is used.
-        :param retry_attempts: Specifies how many times you want it to retry to fetch the URL's content.
+        :param retry_attempts: The number of times to retry to fetch the URL's content.
         :param timeout: Timeout in seconds for the request.
         """
         self.raise_on_failure = raise_on_failure
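
The docstring mentions retries and user-agent rotation for failed requests. As a rough illustration of that idea only, here is a minimal standard-library sketch. It is not the Haystack implementation: `fetch_with_rotation` and its `send` callback are hypothetical names, and it assumes `retry_attempts` counts retries on top of one initial attempt, which the diff does not confirm.

```python
from typing import Callable, List, Optional, Sequence


def fetch_with_rotation(
    url: str,
    send: Callable[[str, str], bytes],  # hypothetical transport: send(url, user_agent) -> body
    user_agents: Sequence[str],
    retry_attempts: int = 2,
) -> bytes:
    """Try `send` once, then retry up to `retry_attempts` times with the next user agent."""
    if not user_agents:
        raise ValueError("user_agents must not be empty")
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")

    last_error: Optional[Exception] = None
    # Assumption: one initial attempt plus `retry_attempts` retries.
    for attempt in range(retry_attempts + 1):
        # Rotate through the agents, wrapping around when attempts outnumber agents.
        agent = user_agents[attempt % len(user_agents)]
        try:
            return send(url, agent)
        except Exception as error:  # a real fetcher would catch narrower errors
            last_error = error
    assert last_error is not None  # the loop ran at least once and every attempt failed
    raise last_error


# Usage with a fake transport that fails twice before succeeding (no network needed).
calls: List[str] = []


def flaky_send(url: str, user_agent: str) -> bytes:
    calls.append(user_agent)
    if len(calls) < 3:
        raise ConnectionError("simulated failure")
    return b"<html>ok</html>"


body = fetch_with_rotation(
    "https://example.com", flaky_send, ["agent-a", "agent-b"], retry_attempts=2
)
```

Passing the transport in as a callback keeps the sketch testable without network access; a real fetcher would issue an HTTP request with the chosen `User-Agent` header instead.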