Fix FIFO pipe handling #174

Open: wants to merge 33 commits into base: master

Commits (33)
e37da29
Choose the most recent inventory
Sep 5, 2014
2cd80fa
Update GlacierWrapper.py
Adi3000 Jan 3, 2015
1b93f1e
Emit archive ID when uploading a single file
gaul Feb 11, 2015
4115be9
Do not emit None message
gaul Feb 11, 2015
33d6e9e
Handle uploading data from FIFO pipes.
gburca Apr 10, 2015
ccf179e
Fixed typo
gburca Apr 11, 2015
04f756e
added note to --output flag usage re: print mode removing long output…
tphummel Mar 26, 2014
48a0faa
Merge commit 'e37da297'
gburca Apr 11, 2015
4e79008
Merge remote-tracking branch 'adi/master'
gburca Apr 11, 2015
34000d2
Merge remote-tracking branch 'schwabix/update-aws-regions'
gburca Apr 11, 2015
ba68628
Merge remote-tracking branch 'andrewgaul/emit-none'
gburca Apr 11, 2015
4ae2e74
Merge remote-tracking branch 'andrewgaul/emit-archive-id'
gburca Apr 11, 2015
23dc0c0
Retry on 408 HTTP responses
gburca Apr 12, 2015
85ef4aa
Reset retry count after a successful upload.
gburca May 10, 2015
ee48ed3
Fixed http retry error
gburca May 31, 2015
00bdc39
Improved failure logging for HTTP 408.
gburca Jun 14, 2015
27d9f9b
Fixed possible error if logger is None
gburca Aug 30, 2015
1deb408
example download using vault-name and archive-id
gidsg Sep 18, 2015
2c0a07f
Fixed variable usage
gburca Sep 19, 2015
ca1b3f6
Merge remote-tracking branch 'gidsg/patch-1'
gburca Sep 19, 2015
d04ab69
Retry on server errors to handle AWS outages.
gburca Sep 29, 2015
19aef87
Exponential back-off when retrying
gburca Jun 3, 2017
91da994
- added constants.py file and started to move there all the constants
Marzona Jun 10, 2017
641a4b5
Merge pull request #1 from Marzona/master
gburca Jun 12, 2017
6788de8
Try to catch HTTP exception
gburca Jun 20, 2017
d9b0a31
Fix unicode string concatenation
gburca Jun 20, 2017
f09bd44
Fix reporting output format error
gburca Jun 22, 2017
d5ddb78
account-id support has been added
oukooveu Sep 21, 2017
7face20
Merge pull request #1 from yumm-git/glacier-account-id
Sep 21, 2017
9948292
Merge pull request #6 from yumm-git/master
gburca Sep 27, 2017
94d4623
Update README
gburca Sep 27, 2017
0458410
Merge remote-tracking branch 'upstream/master'
gburca Sep 27, 2017
c4df56e
Hide passwords when printing help
gburca Sep 28, 2017
149 changes: 86 additions & 63 deletions README.md
@@ -142,7 +142,7 @@ To upload from stdin:
 
     $ cat file | glacier-cmd upload Test --description "Some description" --stdin --name /path/BetterName
 
-IMPORTANT NOTE: If you're uploading from stdin, and you don't specify a
+IMPORTANT NOTE: If you're uploading from stdin or a pipe, and you don't specify a
 --partsize option, your upload will be limited to 1.3Tb, and the progress
 report will come out every 128Mb. For more details, run:
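
Review note on the size limit: the 1.3Tb figure is consistent with Glacier's limit of 10,000 parts per multipart upload combined with a 128 MiB part size (10,000 x 128 MiB is about 1.34 x 10^12 bytes). The sketch below works through that arithmetic. It assumes that "128Mb" in the README means 128 MiB and that this is the part size used when `--partsize` is not given; the units that `--partsize` itself expects are not shown in this excerpt. `max_archive_bytes` and `min_part_size_mib` are hypothetical helper names, not part of glacier-cmd.

```python
MAX_PARTS = 10000        # Glacier's limit on parts per multipart upload
MIB = 1024 * 1024
MAX_PART_MIB = 4096      # Glacier's largest allowed part size (4 GiB)


def max_archive_bytes(part_size_mib):
    """Largest archive that fits in MAX_PARTS parts of the given size."""
    return MAX_PARTS * part_size_mib * MIB


def min_part_size_mib(total_bytes):
    """Smallest power-of-two part size (in MiB) that can hold total_bytes."""
    size = 1
    while max_archive_bytes(size) < total_bytes:
        size *= 2
    if size > MAX_PART_MIB:
        raise ValueError("archive too large for a single multipart upload")
    return size


if __name__ == "__main__":
    # Assumed 128 MiB part size: the ceiling quoted in the README note.
    print(max_archive_bytes(128))         # 1342177280000 bytes, about 1.34 TB
    # A 2 TB stream needs a larger part size than 128 MiB.
    print(min_part_size_mib(2 * 10**12))  # 256
```

Because the total size of a stream is unknown up front, the part size has to be chosen from an estimate of the largest input expected.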

@@ -156,7 +156,8 @@ its file name, its description, or limit the search by region and vault.
 If that is not enough you should use `getarchive` and specify the archive ID of
 the archive you want to retrieve:
 
-    $ TODO: example here
+    $ glacier-cmd download Test eBLl4DbMbZ4YMA7fD9cNacf2z1kGxpYxBqTV4qFVsuzgjuNlKSkWm2rFpw6Gq-bFT6Vt9cUZ1lGqSbtZjtbeh0jYn9tJC-MczQyA3tP6bezYUeN8dGGvqNqT3la79wjRRair1am1JA --outfile filename
+    Read 1 GB of 10 GB (10%). Rate 3.05 MB/s, average 2.71 MB/s, ETA 1:00:00.
 
 To remove uploaded archive use `rmarchive`. You can currently delete only by
 archive id (notice the use of `--` when the archive ID starts with a dash):
@@ -221,80 +222,102 @@ Usage description(help):
usage: glacier-cmd [-h] [-c FILE] [--logtostdout]
[--aws-access-key AWS_ACCESS_KEY]
[--aws-secret-key AWS_SECRET_KEY] [--region REGION]
[--bookkeeping]
[--account-id ACCOUNT_ID] [--bookkeeping]
[--no-bookkeeping]
[--bookkeeping-domain-name BOOKKEEPING_DOMAIN_NAME]
[--logfile LOGFILE]
[--loglevel {-1,DEBUG,0,INFO,1,WARNING,2,ERROR,3,CRITICAL}]
[--output {print,csv,json}]


{mkvault,lsvault,describevault,rmvault,upload,listmultiparts,abortmultipart,inventory,getarchive,download,rmarchive,search,listjobs,describejob,treehash}
...
[--output {csv,json,print}]
[--sdb-access-key SDB_ACCESS_KEY]
[--sdb-secret-key SDB_SECRET_KEY] [--sdb-region SDB_REGION]
{mkvault,lsvault,describevault,rmvault,upload,listmultiparts,abortmultipart,inventory,getarchive,download,rmarchive,search,listjobs,describejob,treehash,sns}
...

Command line interface for Amazon Glacier

optional arguments:
-h, --help show this help message and exit
-c FILE, --conf FILE Name of the file to log messages to. (default:
~/.glacier-cmd)
--logtostdout Send log messages to stdout instead of the config
file. (default: False)
-h, --help show this help message and exit
-c FILE, --conf FILE Name of the file to log messages to. (default:
~/.glacier-cmd)
--logtostdout Send log messages to stdout instead of the config
file. (default: False)

Subcommands:
{mkvault,lsvault,describevault,rmvault,upload,listmultiparts,abortmultipart,inventory,getarchive,download,rmarchive,search,listjobs,describejob,treehash}
For subcommand help, use: glacier-cmd <subcommand> -h
mkvault Create a new vault.
lsvault List available vaults.
describevault Describe a vault.
rmvault Remove a vault.
upload Upload an archive to Amazon Glacier.
listmultiparts List all active multipart uploads.
abortmultipart Abort a multipart upload.
inventory List inventory of a vault, if available. If not
available, creates inventory retrieval job if none
running already.
getarchive Requests to make an archive available for download.
download Download a file by archive id.
rmarchive Remove archive from Amazon Glacier.
search Search Amazon SimpleDB database for available archives
(requires bookkeeping to be enabled).
listjobs List active jobs in a vault.
describejob Describe a job.
treehash Calculate the tree-hash (Amazon style sha256-hash) of
a file.
{mkvault,lsvault,describevault,rmvault,upload,listmultiparts,abortmultipart,inventory,getarchive,download,rmarchive,search,listjobs,describejob,treehash,sns}
For subcommand help, use: glacier-cmd <subcommand> -h
mkvault Create a new vault.
lsvault List available vaults.
describevault Describe a vault.
rmvault Remove a vault.
upload Upload an archive to Amazon Glacier.
listmultiparts List all active multipart uploads.
abortmultipart Abort a multipart upload.
inventory List inventory of a vault, if available. If not
available, creates inventory retrieval job if none
running already.
getarchive Requests to make an archive available for download.
download Download a file by archive id.
rmarchive Remove archive from Amazon Glacier.
search Search Amazon SimpleDB database for available archives
(requires bookkeeping to be enabled).
listjobs List active jobs in a vault.
describejob Describe a job.
treehash Calculate the tree-hash (Amazon style sha256-hash) of
a file.
sns Subcommands related to SNS

aws:
--aws-access-key AWS_ACCESS_KEY
Your aws access key (Required if you have not created
a ~/.glacier-cmd or /etc/glacier-cmd.conf config file)
(default: AKIAIP5VPUSCSJQ6BSSQ)
--aws-secret-key AWS_SECRET_KEY
Your aws secret key (Required if you have not created
a ~/.glacier-cmd or /etc/glacier-cmd.conf config file)
(default: WDgq6ZZn7Y4Lkt5LxPuionw2pTLbonwdFZz1BGtS)
--aws-access-key AWS_ACCESS_KEY
Your aws access key (Required if you have not created
a ~/.glacier-cmd or /etc/glacier-cmd.conf config file)
(default: AKIAINJIQK32YOKKYIPA)
--aws-secret-key AWS_SECRET_KEY
Your aws secret key (Required if you have not created
a ~/.glacier-cmd or /etc/glacier-cmd.conf config file)
(default: Tl1NT/8b5sRxr0Dzz9ySUv50hoJM64hGa8QpiL5k)

glacier:
--region REGION Region where you want to store your archives (Required
if you have not created a ~/.glacier-cmd or /etc
/glacier-cmd.conf config file) (default: us-east-1)
--bookkeeping Should we keep book of all created archives. This
requires a Amazon SimpleDB account and its bookkeeping
domain name set (default: True)
--bookkeeping-domain-name BOOKKEEPING_DOMAIN_NAME
Amazon SimpleDB domain name for bookkeeping. (default:
squirrel)
--no-bookkeeping If present, overrides either CLI or configuration file
options provided for bookkeeping either beforehand or
afterwards
--logfile LOGFILE File to write log messages to. (default: /home/wouter
/.glacier-cmd.log)
--loglevel {-1,DEBUG,0,INFO,1,WARNING,2,ERROR,3,CRITICAL}
Set the lowest level of messages you want to log.
(default: DEBUG)
--output {print,csv,json}
Set how to return results: print to the screen, or as
csv resp. json string. (default: print)
--region REGION Region where you want to store your archives (Required
if you have not created a ~/.glacier-cmd or /etc
/glacier-cmd.conf config file) (default: us-east-1)
--account-id ACCOUNT_ID
AWS account ID of the account that owns the vault
(default: -)
--bookkeeping Should we keep book of all created archives. This
requires a Amazon SimpleDB account and its bookkeeping
domain name set (default: False)
--no-bookkeeping Explicitly disables bookkeeping, regardless of other
configuration or command line options. (default:
False)
--bookkeeping-domain-name BOOKKEEPING_DOMAIN_NAME
Amazon SimpleDB domain name for bookkeeping. (default:
amazon-glacier)
--logfile LOGFILE File to write log messages to. (default: /home/gburca
/.glacier-cmd.log)
--loglevel {-1,DEBUG,0,INFO,1,WARNING,2,ERROR,3,CRITICAL}
Set the lowest level of messages you want to log.
(default: WARNING)
--output {csv,json,print}
Set how to return results: print to the screen, or as
csv resp. json string. NOTE: to receive full output
use csv or json. `print` removes lines longer than 138
chars (default: print)

sdb:
--sdb-access-key SDB_ACCESS_KEY
aws access key to be used with bookkeeping (Required
if you have not created a ~/.glacier-cmd or /etc
/glacier-cmd.conf config file) (default:
AKIAINJIQK32YOKKYIPA)
--sdb-secret-key SDB_SECRET_KEY
aws secret key to be used with bookkeeping (Required
if you have not created a ~/.glacier-cmd or /etc
/glacier-cmd.conf config file) (default:
Tl1NT/8b5sRxr0Dzz9ySUv50hoJM64hGa8QpiL5k)
--sdb-region SDB_REGION
Region where you want to store bookkeeping (Required
if you have not created a ~/.glacier-cmd or /etc
/glacier-cmd.conf config file) (default: us-east-1)

SimpleDB bookkeeping (custom) domain name
-----------------------------------------
@@ -306,7 +329,7 @@ Short Notification Service (SNS) is Amazon's technology that allows you to be no
 
 If you run `glacier-cmd sns sync` without specifying anything in your configuration file, it will automatically subscribe all your vaults to `aws-glacier-notifications` topic.
 
-    $ glacier.py sns sync
+    $ glacier-cmd sns sync
     +------------+-------------------------------------------------+
     | Vault Name | Request Id                                      |
     +------------+-------------------------------------------------+
45 changes: 35 additions & 10 deletions glacier/GlacierWrapper.py
@@ -12,6 +12,7 @@
 import re
 import logging
 import os.path
+import stat
 import time
 import sys
 import traceback
@@ -219,14 +220,17 @@ def glacier_connect_wrap(*args, **kwargs):
 Connecting to Amazon Glacier with
     aws_access_key %s
     aws_secret_key %s
-    region %s\
+    region %s
+    account_id %s\
 """,
                                   self.aws_access_key,
                                   self.aws_secret_key,
-                                  self.region)
+                                  self.region,
+                                  self.account_id)
                 self.glacierconn = GlacierConnection(self.aws_access_key,
                                                      self.aws_secret_key,
-                                                     region_name=self.region)
+                                                     region_name=self.region,
+                                                     account_id=self.account_id)
             except boto.exception.AWSConnectionError as e:
                 raise ConnectionException(
                     "Cannot connect to Amazon Glacier.",
@@ -989,7 +993,13 @@ def upload(self, vault_name, file_name, description, region,
         total_size = 0
         reader = None
         mmapped_file = None
-        if not stdin:
+
+        if stdin:
+            is_pipe = False
+        else:
+            is_pipe = stat.S_ISFIFO(os.stat(file_name).st_mode)
+
+        if not stdin and not is_pipe:
             if not file_name:
                 raise InputException(
                     "No file name given for upload.",
@@ -1005,6 +1015,15 @@ def upload(self, vault_name, file_name, description, region,
                     cause=e,
                     code='FileError')

+        elif is_pipe:
+            try:
+                reader = open(file_name, 'rb')
+                total_size = 0
+            except IOError as e:
+                raise InputException(
+                    "Could not access pipe: %s." % file_name,
+                    cause=e, code='FileError')
+
         elif select.select([sys.stdin,],[],[],0.0)[0]:
             reader = sys.stdin
             total_size = 0
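
Review note on the pipe check: in the `upload` hunk as written, `os.stat(file_name)` runs before the `if not file_name` test, so a missing or empty file name would surface as an error from `os.stat` rather than as the `InputException` that follows. One possible way to keep the FIFO detection while tolerating that case is sketched below. `is_fifo` is a hypothetical helper, not something this PR defines, and treating an unreadable path as "not a pipe" is an assumption about the desired behaviour.

```python
import os
import stat


def is_fifo(path):
    """Return True if path names an existing FIFO (named pipe)."""
    if not path:
        # No name given: let the caller's own "no file name" check report it.
        return False
    try:
        return stat.S_ISFIFO(os.stat(path).st_mode)
    except OSError:
        # Nonexistent or inaccessible path: not a pipe as far as we can tell;
        # the regular-file branch can then raise its usual file error.
        return False
```

With a helper like this, the assignment could read `is_pipe = not stdin and is_fifo(file_name)`, leaving the existing error handling for regular files unchanged.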
@@ -1073,9 +1092,9 @@ def upload(self, vault_name, file_name, description, region,
                 start, stop = (int(p) for p in part['RangeInBytes'].split('-'))
                 stop += 1
                 if not start == current_position:
-                    if stdin:
+                    if stdin or is_pipe:
                         raise InputException(
-                            'Cannot verify non-sequential upload data from stdin.',
+                            'Cannot verify non-sequential upload data from stdin or pipe.',
                             code='ResumeError')
                     if reader:
                         reader.seek(start)
@@ -1199,7 +1218,9 @@ def upload(self, vault_name, file_name, description, region,
             self.logger.debug(msg)

         writer.close()
-        if not stdin:
+        if is_pipe:
+            reader.close()
+        elif not stdin:
             f.close()
         current_time = time.time()
         overall_rate = int(writer.uploaded_size/(current_time - start_time))
@@ -1619,7 +1640,7 @@ def inventory(self, vault_name, refresh):
         # in progress job.
         job_list = self.list_jobs(vault_name)
         inventory_done = False
-        for job in job_list:
+        for job in sorted(job_list, key=lambda x: x['CompletionDate'], reverse=True):
             if job['Action'] == "InventoryRetrieval":

                 # As soon as a finished inventory job is found, we're done.
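
Review note on the sort key: Glacier reports `CompletionDate` as null while a job is still in progress, so `job_list` can mix timestamp strings with `None`. Python 2 tolerates that comparison, but Python 3 raises `TypeError`. A minimal sketch of a sort that works on both is shown below. `newest_first` is a hypothetical name, and the sample job dictionaries are made up for illustration; the sketch relies on the timestamps being uniformly formatted ISO 8601 strings, which sort chronologically as plain text.

```python
def newest_first(jobs):
    """Sort job dicts by CompletionDate, newest first, unfinished jobs last."""
    # An empty string stands in for a missing date; it sorts below any
    # timestamp, so with reverse=True the unfinished jobs end up last.
    return sorted(jobs,
                  key=lambda job: job.get('CompletionDate') or '',
                  reverse=True)


if __name__ == "__main__":
    sample = [
        {'JobId': 'a', 'CompletionDate': '2015-01-01T00:00:00.000Z'},
        {'JobId': 'b', 'CompletionDate': None},   # still in progress
        {'JobId': 'c', 'CompletionDate': '2015-06-01T00:00:00.000Z'},
    ]
    print([job['JobId'] for job in newest_first(sample)])  # ['c', 'a', 'b']
```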
@@ -1938,7 +1959,7 @@ def sns_unsubscribe(self, protocol, endpoint, topic, sns_options):

         return unsubscribed

-    def __init__(self, aws_access_key, aws_secret_key, region,
+    def __init__(self, aws_access_key, aws_secret_key, region, account_id='-',
                  bookkeeping=False, no_bookkeeping=None, bookkeeping_domain_name=None,
                  sdb_access_key=None, sdb_secret_key=None, sdb_region=None,
                  logfile=None, loglevel='WARNING', logtostdout=True):
@@ -1951,6 +1972,8 @@ def __init__(self, aws_access_key, aws_secret_key, region,
         :type aws_secret_key: str
         :param region: name of your default region, see :ref:`regions`.
         :type region: str
+        :param account_id: AWS account ID
+        :type account_id: str
         :param bookkeeping: whether to enable bookkeeping, see :ref:`bookkeeping`.
         :type bookkeeping: boolean
         :param bookkeeping_domain_name: your Amazon SimpleDB domain name where the bookkeeping information will be stored.
@@ -1979,6 +2002,7 @@ def __init__(self, aws_access_key, aws_secret_key, region,
         self.bookkeeping_domain_name = bookkeeping_domain_name

         self.region = region
+        self.account_id = account_id

         self.sdb_access_key = sdb_access_key if sdb_access_key else aws_access_key
         self.sdb_secret_key = sdb_secret_key if sdb_secret_key else aws_secret_key
@@ -1997,6 +2021,7 @@ def __init__(self, aws_access_key, aws_secret_key, region,
     nobookkeeping=%s,
     bookkeeping_domain_name=%s,
     region=%s,
+    account_id=%s,
     sdb_access_key=%s,
     sdb_secret_key=%s,
     sdb_region=%s,
@@ -2005,6 +2030,6 @@ def __init__(self, aws_access_key, aws_secret_key, region,
     logging to stdout %s.""",
                           aws_access_key, aws_secret_key, bookkeeping,
                           no_bookkeeping,
-                          bookkeeping_domain_name, region,
+                          bookkeeping_domain_name, region, account_id,
                           sdb_access_key, sdb_secret_key, sdb_region,
                           logfile, loglevel, logtostdout)
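
Review note on logged credentials: the debug call in this hunk passes `aws_secret_key` and `sdb_secret_key` to the logger as-is, so they end up in the log file whenever debug logging is on. The last commit in this PR is titled "Hide passwords when printing help", but its contents are not shown here, so the following is only a sketch of one way the same idea might be applied to log output. `mask_secret` is a hypothetical helper and is not part of this PR.

```python
def mask_secret(value, visible=4):
    """Return value with all but its last `visible` characters replaced by '*'."""
    if not value:
        # None or empty string: nothing to hide, pass it through unchanged.
        return value
    text = str(value)
    if len(text) <= visible:
        # Too short to show any part safely: mask everything.
        return "*" * len(text)
    return "*" * (len(text) - visible) + text[-visible:]
```

The logging call could then pass `mask_secret(aws_secret_key)` and `mask_secret(sdb_secret_key)` in place of the raw values, keeping the last few characters visible so that different keys can still be told apart in the log.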