Is there any way to download .tar files from web to local folder fast?

SandeepAllampalli · October 10, 2022, 11:39am

I am trying to download a batch of .tar files from the following website to my local folder, but the download is very slow. Each file is roughly 30 - 60 MB. Is there a way to improve this code so that the files download faster? Please check the code below:

import requests
from os import makedirs
from os.path import join
from bs4 import BeautifulSoup

url = "https://opendata.dwd.de/climate_environment/CDC/grids_germany/hourly/radolan/historical/asc/"

years = [str(year) for year in range(2005, 2021)]
links = [url + year + "/" for year in years]

def get_tarlinks():
    # collect the .tar links of every yearly directory listing
    t_links = []
    for page in links:
        # create response object
        r = requests.get(page)
        # create BeautifulSoup object
        soup = BeautifulSoup(r.content, 'html5lib')
        # find all links on the webpage
        a_links = soup.find_all('a')
        # keep only the links ending with .tar
        tar_links = [page + link['href'] for link in a_links if link.get('href', '').endswith('.tar')]
        t_links.append(tar_links)
    return t_links

t_links = get_tarlinks()

src_path = "D:/Sandeep/Thesis/Data/"

for year_links in t_links:
    for link in year_links:
        # the last two URL components are the year and the file name
        year, filename = link.split('/')[-2:]
        r = requests.get(link, allow_redirects=True)
        # create the year folder if needed and write via the full path,
        # so the file lands in the right folder without relying on chdir
        makedirs(join(src_path, year), exist_ok=True)
        with open(join(src_path, year, filename), "wb") as f:
            f.write(r.content)

Link to the same question posted on Stack Overflow

Note: Please check the indentation when you copy this code to your IDE. Thanks!

jorahu · October 11, 2022, 7:54am

Hey
I would assume this comes down mostly to your internet connection speed, to the limits of the physical disks where the data is stored, or to the server restricting access to limit its traffic.

Are you using a Wi-Fi or a wired connection? As I understand it, you are doing this for your thesis. Universities usually have quite good internet connections, so maybe download the data at your university facilities. If nothing can be done about the connection, I would recommend writing a script that downloads the data automatically (run under nohup, tmux, slurm or something like that) and leaving it to download for however long it takes. You can already work on your further code once some files have arrived and make the most of the time while the rest is downloading.
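
For the unattended script, here is a minimal sketch of what I mean, not a tested solution for the DWD server. It assumes the URLs end in .../<year>/<file>.tar as in your code. The function names (target_path, pending, download, download_all) are made up for illustration. It uses only the standard library (urllib) instead of requests, streams each file to disk rather than holding it in memory, and skips files that are already there, so you can simply rerun it after an interruption.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def target_path(url, root):
    """Map .../<year>/<file>.tar to <root>/<year>/<file>.tar."""
    parts = urlparse(url).path.rstrip("/").split("/")
    year, filename = parts[-2], parts[-1]
    return Path(root) / year / filename


def pending(urls, root):
    """Return the URLs whose target file does not exist yet."""
    return [u for u in urls if not target_path(u, root).exists()]


def download(url, root, timeout=60):
    """Stream one file to disk, writing to a .part file first."""
    dest = target_path(url, root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(tmp, "wb") as out:
        shutil.copyfileobj(resp, out)
    tmp.replace(dest)  # rename only after the copy has finished
    return dest


def download_all(urls, root):
    """Download every missing file; report failures and leave them for the next run."""
    failed = []
    for url in pending(urls, root):
        try:
            download(url, root)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            print(f"failed: {url} ({exc})")
            failed.append(url)
    return failed
```

You would flatten your t_links into one list, call download_all(all_links, src_path), and start the script with something like nohup python download.py & (the script name is just a placeholder). Because an interrupted file stays as .part, a rerun only fetches what is still missing.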

SandeepAllampalli · October 11, 2022, 8:34pm

Hey Jorahu,

Thank you for your valuable suggestions. Actually it worked.

Thank you.
