Script is broken as a result of website redesign #10

FThompson · 2021-03-02T06:23:19Z

The BBC redesigned their website, and it now lives at a new location: https://sound-effects.bbcrewind.co.uk/

There's no longer a CSV of every file, and the files are named differently, so the script will need to be rewritten to scrape the website and download the files that way.

If the torrent stays active, I might choose not to update this script so we avoid racking up a large cloud service bill for the BBC.

Tobe2d · 2021-04-17T17:20:52Z

+1

After it finished downloading, I found out that every file is only 6 KB ;-(

Hope to see an updated script soon.
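
A uniform size of about 6 KB suggests the downloads may not be audio at all (for example, an error page saved under a .wav name), though that is an assumption and not something confirmed in this thread. A minimal sketch for spotting such files checks for the RIFF/WAVE header that a standard WAV file starts with. The names `looks_like_wav` and `find_bad_downloads` are illustrative and not part of the original script.

```python
from pathlib import Path


def looks_like_wav(path):
    """Return True if the file starts with a RIFF header marked as WAVE."""
    with open(path, 'rb') as f:
        header = f.read(12)
    # In a standard WAV file, bytes 0-3 are b'RIFF' and bytes 8-11 are b'WAVE'
    return len(header) == 12 and header[:4] == b'RIFF' and header[8:12] == b'WAVE'


def find_bad_downloads(folder):
    """List .wav files under folder that do not have a WAV header."""
    return sorted(p for p in Path(folder).rglob('*.wav') if not looks_like_wav(p))
```

Calling `find_bad_downloads('sounds')` would list the files worth deleting and downloading again once the URL is fixed, assuming the downloads live in a `sounds` folder.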

Eptiar · 2021-04-17T22:34:50Z

Hey. I messaged you on Reddit just a few minutes ago without realising this issue existed. It'd be great to get an updated script. I've tried the torrent but with no luck. Also, the new site has 33,000 or so samples now. If you ever update the script, could you PM me on Reddit? @thatdrummerchap. Cheers! I hope everything goes well.

70hundert · 2021-08-11T11:34:44Z

I would be interested in this too! I tried changing the code to point at the new location, but without luck so far.

meseck · 2022-01-07T19:27:40Z

It seems that all the old sound files from the csv file are available at least with this URL: "https://sound-effects-media.bbcrewind.co.uk/wav"

With my little bash script (fools-mate/random-sample) it definitely works again.

kitchWWW · 2023-03-06T02:52:56Z

For folks finding this, change lines 59-61 of the script to the following:

                filepath = Path('sounds') / folder / (filename+".zip")
                if not filepath.exists():
                    url = 'http://sound-effects-media.bbcrewind.co.uk/zip/' + row['location']+'.zip'

This changes it to use the proper new URL, and will give you a bunch of zip files instead of wav files. Still working on how to unzip all of these in an intelligent way.
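
For context, here is a minimal, self-contained sketch of the same idea: build the zip URL from a CSV row's `location` value and skip files that already exist. The helper names `zip_url` and `download_if_missing` are hypothetical, and the sketch assumes `location` is the same value the original script reads from the CSV. It is not the script's actual code.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import quote

# Base URL taken from the snippet above
ZIP_BASE = 'http://sound-effects-media.bbcrewind.co.uk/zip/'


def zip_url(location):
    """Build the zip URL for a CSV row's 'location' value."""
    return ZIP_BASE + quote(location) + '.zip'


def download_if_missing(url, filepath):
    """Download url to filepath unless it already exists; return True if downloaded."""
    filepath = Path(filepath)
    if filepath.exists():
        return False
    filepath.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # is not mistaken for a finished one on the next run
    partial = filepath.with_name(filepath.name + '.part')
    with urllib.request.urlopen(url) as response, open(partial, 'wb') as out:
        shutil.copyfileobj(response, out)
    partial.replace(filepath)
    return True
```

Inside the script's loop this would be called roughly as `download_if_missing(zip_url(row['location']), Path('sounds') / folder / (filename + '.zip'))`.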

prh78 · 2023-06-07T22:13:12Z

> For folks finding this, change lines 59-61 of the script to the following:
>
>                 filepath = Path('sounds') / folder / (filename+".zip")
>                 if not filepath.exists():
>                     url = 'http://sound-effects-media.bbcrewind.co.uk/zip/' + row['location']+'.zip'
>
> This changes it to use the proper new URL, and will give you a bunch of zip files instead of wav files. Still working on how to unzip all of these in an intelligent way.

Thanks for this, it worked. I used AI to make a script that unzips the archives and renames the extracted files to match the archive name, since the meaningful description was lost in the output file. I had to tweak it after the script missed archives with multiple periods in the filename.

import os
import zipfile

def process_archives_recursive(folder_path):
    # Iterate over all files and directories in the folder
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            # Check if the file is a zip archive
            if file.endswith('.zip'):
                archive_path = os.path.join(root, file)
                rename_archive(archive_path)

def rename_archive(archive_path):
    # Extract the directory and filename from the archive path
    directory = os.path.dirname(archive_path)
    filename = os.path.basename(archive_path)

    # Split the filename and extension
    base_name, extension = os.path.splitext(filename)

    # Replace dots with hyphens in the base name
    new_base_name = base_name.replace('.', '-')

    # Create the new filename, keeping the original .zip extension
    new_filename = new_base_name + extension

    # Rename the archive file
    new_archive_path = os.path.join(directory, new_filename)
    os.rename(archive_path, new_archive_path)

    print(f"Archive renamed to: {new_archive_path}")

    # Unzip the renamed archive
    unzip_archive(new_archive_path, directory)

def unzip_archive(archive_path, directory):
    # Open the archive
    with zipfile.ZipFile(archive_path, 'r') as zip_ref:
        # Extract all files from the archive
        zip_ref.extractall(directory)

        # Get the list of extracted files while the archive is still open
        extracted_files = zip_ref.namelist()

    if len(extracted_files) == 1:
        extracted_file = extracted_files[0]
        extracted_file_path = os.path.join(directory, extracted_file)

        # Rename the extracted file to match the archive name with the WAV extension
        # (splitext strips only the final '.zip', so earlier periods cannot truncate the name)
        new_file_name = os.path.splitext(os.path.basename(archive_path))[0] + '.wav'
        new_file_path = os.path.join(directory, new_file_name)
        os.rename(extracted_file_path, new_file_path)

        print(f"File extracted and renamed to: {new_file_path}")
    else:
        print(f"The archive '{os.path.basename(archive_path)}' does not contain a single file.")

# Example usage
folder_path = '/path/to/folder'
process_archives_recursive(folder_path)

Make sure to replace '/path/to/folder' with the actual path to your folder. The script first renames each archive, replacing the periods in its base name with hyphens, and then extracts it. If an archive contains a single file, that file is extracted and renamed to match the archive, with the .wav extension. If an archive contains any other number of files, the files are still extracted but not renamed, and a message says that the archive does not contain a single file. The new names and extraction details are printed as output.
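
A possible variation, offered only as a sketch: the extracted file can be written straight to its final name, which avoids renaming the archive at all and keeps the periods in the description. The function name `extract_single_wav` is hypothetical. It assumes each archive of interest holds exactly one audio file, as the script above does.

```python
import shutil
import zipfile
from pathlib import Path


def extract_single_wav(archive_path):
    """Extract a one-file archive next to itself, named after the archive.

    Returns the new path, or None if the archive does not hold exactly one file.
    """
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path) as zf:
        members = [info for info in zf.infolist() if not info.is_dir()]
        if len(members) != 1:
            return None
        # Path.stem removes only the final '.zip', so earlier periods are kept
        stem = archive_path.stem
        name = stem if stem.lower().endswith('.wav') else stem + '.wav'
        target = archive_path.with_name(name)
        # Stream the member to its final name instead of extracting then renaming
        with zf.open(members[0]) as src, open(target, 'wb') as dst:
            shutil.copyfileobj(src, dst)
    return target
```

To process a whole folder, something like `for p in Path('/path/to/folder').rglob('*.zip'): extract_single_wav(p)` would cover the same ground as `process_archives_recursive`.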
