It does not work on Reddit, Twitter, Instagram, or TikTok
After downloading the repository, I ran it from cmd.
I wanted to try it on Instagram, TikTok, Twitter, and Reddit.
This is the problem I see on Instagram.
The console output for each problem is below.
C:\Users\pc\Desktop\media-scraper>python -m mediascraper.instagram instagram
Starting PhantomJS web driver...
.\webdriver/phantomjsdriver_2.1.1_win32/phantomjs.exe
C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
  warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Crawling...
Traceback (most recent call last):
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\pc\Desktop\media-scraper\mediascraper\instagram.py", line 16, in <module>
    tasks = scraper.scrape(username)
  File "C:\Users\pc\Desktop\media-scraper\mediascrapers.py", line 262, in scrape
    tasks += (task[0], username, task[1])
IndexError: list index out of range
C:\Users\pc\Desktop\media-scraper>python m-scraper.py rq instagram instagram
Namespace(credential_file=None, early_stop=False, keywords=['instagram'], save_path=None)
Instagramer Task: instagram
Traceback (most recent call last):
  File "m-scraper.py", line 36, in <module>
    scraper.run(sys.argv[3:])
  File "C:\Users\pc\Desktop\media-scraper\m_scraper\rq\downloader.py", line 82, in run
    self.crawl(keyword, args.early_stop)
  File "C:\Users\pc\Desktop\media-scraper\m_scraper\rq\instagramer.py", line 46, in crawl
    tasks, end_cursor, has_next, length, user_id, rhx_gis, csrf_token = get_first_page(username)
  File "C:\Users\pc\Desktop\media-scraper\m_scraper\rq\utils\instagram.py", line 52, in get_first_page
    rhx_gis = shared_data['rhx_gis']
KeyError: 'rhx_gis'
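The `KeyError` above suggests that the `rhx_gis` key is not present in the data that `get_first_page` reads. A minimal sketch of a more tolerant lookup, assuming `shared_data` is a plain dict; the helper name `read_rhx_gis` is hypothetical and not part of media-scraper:

```python
from typing import Any, Dict, Optional


def read_rhx_gis(shared_data: Dict[str, Any]) -> Optional[str]:
    """Return shared_data['rhx_gis'] if present, otherwise None.

    Hypothetical helper: it only avoids the KeyError. The caller still
    has to decide what to do when the key is missing.
    """
    return shared_data.get('rhx_gis')


# Example with made-up dictionaries (not real Instagram data).
print(read_rhx_gis({'rhx_gis': 'abc'}))
print(read_rhx_gis({}))
```

This would only change how the failure surfaces; whether the rest of the crawl can proceed without the value is a separate question.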
TikTok
C:\Users\pc\Desktop\media-scraper>python m-scraper.py rq tiktok tiktok
Namespace(credential_file=None, early_stop=False, keywords=['tiktok'], save_path=None)
{'statusCode': 10000, 'verifyConfig': {'code': 10000, 'type': 'verify', 'subtype': 'slide', 'fp': 'verify_dd9489ac31d2e6b50a4f9ed75b5240f2', 'region': 'va', 'detail': 'vEyCkJEKBnSe-zq257GFQJrLW03-aOs8awmNop3PD5IGQA4kjoDDIU6NDQKG7BnEsMWT8C-WHIUjsfHZ9OMgl9009Qcdo2LIOBhGJyNK118AOCRmw8StlADDjuzkZrFHFDTHnSgp2x651wwrNM6-FYFCOlP0izZx6n*pCjcMIM1sjOh0zwAye*FM5lPnHiVJ1eER3KmM*q6VpyCU*uNyTeYkaDpcFOMdgP3br0HlsWO--*jeaUPVnSjP8RejdrEQgq7oLsXM4rjjf14GhyWBa0H8kj*LODz42UoKrM32r4Fm6VjEAoEeRrjmHVUkwbwAptLOsfmREJTSdtNToMx6t4NqBXWm0mJ24vXdY9Txp83rH49pTmZE1wbEupTi18B1Tw..'}}
Traceback (most recent call last):
  File "m-scraper.py", line 36, in <module>
    scraper.run(sys.argv[3:])
  File "C:\Users\pc\Desktop\media-scraper\m_scraper\rq\downloader.py", line 82, in run
    self.crawl(keyword, args.early_stop)
  File "C:\Users\pc\Desktop\media-scraper\m_scraper\rq\tiktoker.py", line 38, in crawl
    raise Exception('body not found')
Exception: body not found
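The dictionary printed before the traceback looks like a slide verification challenge rather than page data, which may be why the crawl ends with `body not found`. A small sketch of how such a response could be recognized, assuming the payload has the same shape as the one printed above; the function name `is_verification_challenge` is hypothetical and not part of media-scraper:

```python
from typing import Any, Dict


def is_verification_challenge(payload: Dict[str, Any]) -> bool:
    """Return True if payload carries a 'verifyConfig' dict whose 'type' is 'verify'.

    The field names are taken from the output shown above; nothing else
    about the response format is assumed.
    """
    config = payload.get('verifyConfig')
    return isinstance(config, dict) and config.get('type') == 'verify'


# Shortened copy of the response printed above (long fields left out).
sample = {
    'statusCode': 10000,
    'verifyConfig': {'code': 10000, 'type': 'verify', 'subtype': 'slide'},
}
if is_verification_challenge(sample):
    print('verification challenge returned instead of page data')
```

A check like this would not get past the challenge; it would only make the error message say what actually came back.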
Twitter
C:\Users\pc\Desktop\media-scraper>python -m mediascraper.twitter twitter
Starting PhantomJS web driver...
.\webdriver/phantomjsdriver_2.1.1_win32/phantomjs.exe
C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\phantomjs\webdriver.py:49: UserWarning: Selenium support for PhantomJS has been deprecated, please use headless versions of Chrome or Firefox instead
  warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '
Crawling...
Traceback (most recent call last):
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\pc\Desktop\media-scraper\mediascraper\twitter.py", line 18, in <module>
    tasks = scraper.scrape(username)
  File "C:\Users\pc\Desktop\media-scraper\mediascrapers.py", line 392, in scrape
    done = self.scrollToBottom()
  File "C:\Users\pc\Desktop\media-scraper\mediascrapers.py", line 87, in scrollToBottom
    last_height, new_height = self._driver.execute_script("return document.body.scrollHeight"), 0
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 636, in execute_script
    'args': converted_args})['value']
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\pc\AppData\Local\Programs\Python\Python37-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: {"errorMessage":"Refused to evaluate a string as JavaScript because 'unsafe-eval' is not an allowed source of script in the following Content Security Policy directive: \"script-src 'self' 'unsafe-inline' https://*.twimg.com https://recaptcha.net/recaptcha/ https://www.google.com/recaptcha/ https://www.gstatic.com/recaptcha/ https://www.google-analytics.com https://twitter.com https://app.link https://accounts.google.com/gsi/client https://appleid.cdn-apple.com/appleauth/static/jsapi/appleid/1/en_US/appleid.auth.js 'nonce-MDk3YWFmNWEtOGRkZS00NGQzLWE2MjMtZjUzNzhjMGZhZGJl'\".\n","request":{"headers":{"Accept":"application/json","Accept-Encoding":"identity","Content-Length":"112","Content-Type":"application/json;charset=UTF-8","Host":"127.0.0.1:55938","User-Agent":"selenium/3.141.0 (python windows)"},"httpVersion":"1.1","method":"POST","post":"{\"script\": \"return document.body.scrollHeight\", \"args\": [], \"sessionId\": \"7c44f2b0-7a92-11ec-a2dc-d38583c010ce\"}","url":"/execute","urlParsed":{"anchor":"","query":"","file":"execute","directory":"/","path":"/execute","relative":"/execute","port":"","host":"","password":"","user":"","userInfo":"","authority":"","protocol":"","source":"/execute","queryKey":{},"chunks":["execute"]},"urlOriginal":"/session/7c44f2b0-7a92-11ec-a2dc-d38583c010ce/execute"}}
Screenshot: available via screen
I also tried setting the web driver's permissions to 777 for convenience:
C:\Users\pc\Desktop\media-scraper>chmod 777 webdriver/phantomjsdriver_2.1.1_win32/phantomjs.exe
'chmod' is not recognized as an internal or external command,
operable program or batch file.
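`chmod` is a Unix command, so cmd on Windows does not recognize it, and a `.exe` file on Windows does not need an execute bit to run. A minimal sketch of a cross-platform alternative that sets the execute bits only where they apply; the helper name `make_executable` is hypothetical and not part of media-scraper:

```python
import os
import platform
import stat


def make_executable(path: str) -> bool:
    """Add execute bits to path on non-Windows systems; do nothing on Windows.

    Returns True if the permissions were changed, False if the step was skipped.
    """
    if platform.system() == 'Windows':
        # Windows does not use Unix permission bits to decide what can run.
        return False
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return True
```

On Windows the function returns `False` without touching the file, so the `chmod` step can simply be skipped there.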
I wish this tool worked, because it would be the best tool on the internet.
Greetings to all
hi
Windows 7
Python 3.7.6
pip 21.3.1
@elvisyjlin