This repository has been archived by the owner on Jan 14, 2020. It is now read-only.

Fix documentation, test and release 1.3.0
JGoutin committed Mar 29, 2019
1 parent e3d5a52 commit 7739890
Showing 17 changed files with 271 additions and 128 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -17,20 +17,20 @@ Pycosio brings standard Python I/O to cloud objects by providing:
* Features equivalent to the standard library (``io``, ``os``, ``os.path``,
``shutil``) for seamlessly managing cloud objects and local files.

- Theses functions are source agnostic and always provide the same interface for
+ These functions are source agnostic and always provide the same interface for
all files from cloud storage or local file systems.

- Buffered cloud objects also support following features:
+ Buffered cloud objects also support the following features:

* Buffered asynchronous writing of any object size.
- * Buffered asynchronous preloading in read mode.
+ * Buffered asynchronous preloading in reading mode.
* Write or read lock depending on memory usage limitation.
* Maximization of bandwidth using parallel connections.
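
As a rough sketch of the buffered asynchronous writing idea above, the following splits data into fixed-size buffers and uploads them through a thread pool. This is a simplified illustration only, not Pycosio's actual implementation; ``upload_part`` is a hypothetical callable standing in for one storage request:

```python
import concurrent.futures


def buffered_upload(data, buffer_size, upload_part):
    """Split *data* into fixed-size buffers and upload them in parallel.

    *upload_part* is a hypothetical callable taking (part_index, chunk);
    parallel submissions maximize bandwidth while each part is in flight.
    """
    chunks = [data[i:i + buffer_size]
              for i in range(0, len(data), buffer_size)]
    with concurrent.futures.ThreadPoolExecutor() as executor:
        # Submit every buffer asynchronously, then wait for all results
        futures = [executor.submit(upload_part, i, chunk)
                   for i, chunk in enumerate(chunks)]
        return [future.result() for future in futures]
```

For example, ``buffered_upload(b"x" * 10, 4, lambda i, c: len(c))`` uploads three buffers (two full, one partial) concurrently.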

Supported Cloud storage
-----------------------

- Pycosio is compatible with following cloud objects storage services:
+ Pycosio is compatible with the following cloud objects storage services:

* Alibaba Cloud OSS
* Amazon Web Services S3
2 changes: 1 addition & 1 deletion docs/api_storage.rst
@@ -15,7 +15,7 @@ The following table shows features available for each storage.
API
---

- The following pages describes each storage.
+ The following pages describe each storage.

.. automodule:: pycosio.storage
:members:
86 changes: 85 additions & 1 deletion docs/api_storage_azure_blob.rst
@@ -12,7 +12,7 @@ An Azure storage account can be mounted using the Pycosio ``mount`` function.
``azure.storage.blob.baseblobservice.BaseBlobService`` class from
``azure-storage-blob`` Python library.

- This example show the mount of Azure Storage Blob with the minimal
+ This example shows the mount of Azure Storage Blob with the minimal
configuration:

.. code-block:: python
@@ -45,6 +45,90 @@ Limitation

Only one configuration per Azure Storage account can be mounted simultaneously.

Azure blob type selection
-------------------------

It is possible to select the blob type for new files created using the
``blob_type`` argument.

Possible blob types are ``BlockBlob``, ``AppendBlob`` & ``PageBlob``.

The default blob type can be set when mounting the storage
(if not specified, the ``BlockBlob`` type is used by default):

.. code-block:: python

    import pycosio

    pycosio.mount(storage='azure_blob', storage_parameters=dict(
        account_name='my_account_name', account_key='my_account_key',
        # Using PageBlob by default for new files
        blob_type='PageBlob',
        )
    )

It can also be selected for a specific file when opening it in write mode:

.. code-block:: python

    # Open a new file in write mode as PageBlob
    with pycosio.open(
            'https://my_account.blob.core.windows.net/my_container/my_blob',
            'wb', blob_type='PageBlob') as file:
        file.write(b'0')

Page blob specific features
---------------------------

Page blobs support the following specific features.

Preallocating pages
~~~~~~~~~~~~~~~~~~~

When flushing a page blob beyond its current size, Pycosio first resizes the
blob to allow the new data to be flushed.

With multiple flushes on a raw IO, or when using a buffered IO, this is
done with extra requests to the Azure server. If the size to write is known
before opening the file, these extra requests can be avoided by
preallocating the required size in a single initial request.
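
As an illustration of why preallocation saves round trips, here is a toy model (purely hypothetical, not the real Azure client) that counts the resize requests issued while flushing three 512-byte pages:

```python
class FakePageBlob:
    """Toy stand-in for a remote page blob that counts resize requests."""

    def __init__(self, size=0):
        self.size = size
        self.resize_requests = 0

    def resize(self, new_size):
        # Each resize is one extra request to the server
        self.resize_requests += 1
        self.size = new_size

    def flush(self, offset, data):
        end = offset + len(data)
        if end > self.size:
            # Flushing past the current size forces a resize first
            self.resize(end)


# Without preallocation: every flush grows the blob, one resize each time
blob = FakePageBlob()
for offset in (0, 512, 1024):
    blob.flush(offset, b"\1" * 512)
assert blob.resize_requests == 3

# With preallocation (like content_length=1536): one initial resize only
blob = FakePageBlob()
blob.resize(1536)
for offset in (0, 512, 1024):
    blob.flush(offset, b"\1" * 512)
assert blob.resize_requests == 1
```

The second pattern is what ``content_length`` enables: the blob is sized once up front, so subsequent flushes stay within the allocated pages.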

The ``content_length`` argument allows preallocating a page blob to a defined
size when opening it in write mode:

.. code-block:: python

    # Open a new page blob and preallocate it with 1024 bytes.
    with pycosio.open(
            'https://my_account.blob.core.windows.net/my_container/my_blob',
            'wb', blob_type='PageBlob', content_length=1024) as file:
        file.write(b'1')

    # Append on an existing page blob and pre-resize it to 2048 bytes.
    with pycosio.open(
            'https://my_account.blob.core.windows.net/my_container/my_blob',
            'ab', blob_type='PageBlob', content_length=2048) as file:
        file.write(b'1')

The preallocation is done with padding of null characters (``b'\0'``).

End page padding handling
~~~~~~~~~~~~~~~~~~~~~~~~~

By default, Pycosio tries to handle page blobs like standard files by ignoring
trailing page padding of null characters:

* When opening a file in append mode (seeks to the end of the file after
  ignoring the trailing padding of the last page).
* When reading data (reads until a null character is reached).
* When using the ``seek()`` method with ``whence=os.SEEK_END`` (ignores the
  trailing padding when determining the end of the file used as the reference
  position).

This behaviour can be disabled using the ``ignore_padding=False`` argument when
opening the page blob.
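
The padding handling described above can be pictured with a small sketch (an illustration only, not Pycosio's actual code): the logical end of a page blob is found by ignoring the trailing run of null characters.

```python
def logical_end(blob_content: bytes) -> int:
    """Return the position treated as the end of file.

    Azure page blobs are stored in 512-byte pages, so a blob whose data
    stops mid-page is padded to the page boundary with b'\0'. Ignoring
    that trailing padding makes the blob behave like a standard file.
    """
    return len(blob_content.rstrip(b"\0"))


# A two-page (1024-byte) blob whose real data stops after 700 bytes
content = b"\1" * 700 + b"\0" * 324
assert logical_end(content) == 700  # logical size: padding ignored
assert len(content) == 1024         # physical size is unchanged
```

With ``ignore_padding=False``, reads and ``SEEK_END`` seeks would instead use the physical size, including the padding.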

Files objects classes
---------------------

33 changes: 32 additions & 1 deletion docs/api_storage_azure_file.rst
@@ -12,7 +12,7 @@ An Azure storage account can be mounted using the Pycosio ``mount`` function.
``azure.storage.file.fileservice.FileService`` class from
``azure-storage-file`` Python library.

- This example show the mount of Azure Storage File with the minimal
+ This example shows the mount of Azure Storage File with the minimal
configuration:

.. code-block:: python
@@ -45,6 +45,37 @@ Limitation

Only one configuration per Azure Storage account can be mounted simultaneously.

Preallocating files
-------------------

When flushing a file beyond its current size, Pycosio first resizes the
file to allow the new data to be flushed.

With multiple flushes on a raw IO, or when using a buffered IO, this is
done with extra requests to the Azure server. If the size to write is known
before opening the file, these extra requests can be avoided by
preallocating the required size in a single initial request.

The ``content_length`` argument allows preallocating a file to a defined
size when opening it in write mode:

.. code-block:: python

    # Open a new file and preallocate it with 1024 bytes.
    with pycosio.open(
            '//my_account.file.core.windows.net/my_share/my_file',
            'wb', content_length=1024) as file:
        file.write(b'1')

    # Append on an existing file and pre-resize it to 2048 bytes.
    with pycosio.open(
            '//my_account.file.core.windows.net/my_share/my_file',
            'ab', content_length=2048) as file:
        file.write(b'1')

The preallocation is done with padding of null characters (``b'\0'``).


Files objects classes
---------------------

7 changes: 4 additions & 3 deletions docs/api_storage_http.rst
@@ -1,14 +1,15 @@
pycosio.storage.http
====================

- HTTP/HTTPS object read only access.
+ HTTP/HTTPS object read-only access.

Mount
-----

- The HTTP storage does not require to be mounted prior to be used.
+ The HTTP storage does not require to be mounted prior to being used.

- Function can be used directly on any HTTP object reachable by the Pycosio host:
+ The function can be used directly on any HTTP object reachable by the Pycosio
+ host:

.. code-block:: python
4 changes: 2 additions & 2 deletions docs/api_storage_oss.rst
@@ -13,11 +13,11 @@ OSS can be mounted using the Pycosio ``mount`` function.
``oss2`` Python library (The class selection is done automatically based on
parameters found in ``storage_parameters``).

- Pycosio also require one extra argument, the ``endpoint``, which is basically
+ Pycosio also requires one extra argument, the ``endpoint``, which is basically
the URL of the OSS Alibaba region to use. (See ``endpoint`` argument of the
``oss2.Bucket`` class)

- This example show the mount of OSS with the minimal configuration:
+ This example shows the mount of OSS with the minimal configuration:

.. code-block:: python
6 changes: 3 additions & 3 deletions docs/api_storage_s3.rst
@@ -15,7 +15,7 @@ arguments to pass to ``boto3.session.Session(**session_parameters)`` from the
It can also include a sub-directory ``client`` that is used to pass arguments to
``boto3.session.Session.client('s3', **client_parameters)``.

- This example show the mount of S3 with the minimal configuration:
+ This example shows the mount of S3 with the minimal configuration:

.. code-block:: python
@@ -39,12 +39,12 @@ This example show the mount of S3 with the minimal configuration:
Automatic mount
~~~~~~~~~~~~~~~

- It is not required to mount S3 explicitly When using pycosio on a host
+ It is not required to mount S3 explicitly when using pycosio on a host
configured to handle AWS S3 access (Through IAM policy, configuration
files, environment variables, ...).

In this case, mounting is done transparently on the first call of a Pycosio
- function on a S3 object and no configuration or extra steps are required:
+ function on an S3 object and no configuration or extra steps are required:

.. code-block:: python
2 changes: 1 addition & 1 deletion docs/api_storage_swift.rst
@@ -12,7 +12,7 @@ An OpenStack Swift project can be mounted using the Pycosio ``mount`` function.
``swiftclient.client.Connection`` class from ``python-swiftclient`` Python
library.

- This example show the mount of OpenStack Swift with the minimal configuration:
+ This example shows the mount of OpenStack Swift with a minimal configuration:

.. code-block:: python
47 changes: 24 additions & 23 deletions docs/changes.rst
@@ -1,7 +1,7 @@
Changelog
=========

- 1.3.0 (2019/??)
+ 1.3.0 (2019/03)
---------------

Add support for following cloud storage:
@@ -11,8 +11,8 @@ Add support for following cloud storage:

Improvements:

- * ``io.RawIOBase`` can now be used for storage that support random write access.
- * OSS: Copy objects between OSS buckets without copying data on client when
+ * ``io.RawIOBase`` can now be used for storage that supports random write access.
+ * OSS: Copy objects between OSS buckets without copying data on the client when
possible.

Deprecations:
@@ -21,23 +21,25 @@ Deprecations:

Fixes:

- * Fix file methods not translate Cloud Storage exception into OSError.
+ * Fix unsupported operation not raised in all cases with raw and buffered IO.
+ * Fix call of ``flush()`` in buffered IO.
+ * Fix file methods not translating cloud storage exception into ``OSError``.
* Fix file not created on open in write mode (was only created on flush).
* Fix file closed twice when using context manager.
* Fix root URL detection in some cases.
* Fix too many returned results when listing objects with a count limit.
* Fix error when trying to append on a not existing file.
- * Fix ``io.RawIOBase`` not generating padding when seeking after end of file.
+ * Fix ``io.RawIOBase`` not generating padding when seeking after the end of the file.
* OSS: Fix error when listing objects in a not existing directory.
- * OSS: Fix read error if try to read after end of file.
+ * OSS: Fix read error when trying to read after the end of the file.
* OSS: Fix buffered write minimum buffer size.
* OSS: Clean up multipart upload parts on failed uploads.
- * OSS: Fix error when opening existing file in 'a' mode.
- * S3: Fix error when creating a bucket due to unspecified region.
- * S3: Fix unprocessed error in listing bucket content of an not existing bucket.
+ * OSS: Fix error when opening an existing file in 'a' mode.
+ * S3: Fix error when creating a bucket due to an unspecified region.
+ * S3: Fix unprocessed error in listing bucket content of a not existing bucket.
* S3: Clean up multipart upload parts on failed uploads.
* S3: Fix missing transfer acceleration endpoints.
- * Swift: Fix error when opening existing file in 'a' mode.
+ * Swift: Fix error when opening an existing file in 'a' mode.

1.2.0 (2018/10)
---------------
@@ -50,23 +52,23 @@ New standard library equivalent functions:

Improvements:

- * Copy of objects from and to a same storage is performed directly on remote
+ * Copy of objects from and to the same storage is performed directly on remote
server if possible.
* Pycosio now raises ``io.UnsupportedOperation`` if an operation is not
- compatible with the current storage, this apply to all newly created function
+ compatible with the current storage, this applies to all newly created function
and following existing functions: ``getsize``, ``getmtime``, ``mkdir``.

Fixes:

* ``io.BufferedIOBase.read`` now returns empty bytes instead of raising
- exception when trying to read if seek already at end of file.
+ exception when trying to read if seek already at end of the file.
* ``copy`` destination can now be a storage directory and not only a local
directory.
* ``copy`` now checks if destination parent directory exists and if files
- are not same file and raise proper exceptions.
+ are not the same file and raise proper exceptions.
* ``mkdir``: missing ``dir_fd`` argument.
* ``isdir`` now correctly handle "virtual" directories (Directory that don't
- exist as proper object, but exists in another object path).
+ exist as a proper object, but exists in another object path).

1.1.0 (2018/10)
---------------
@@ -84,39 +86,38 @@ Backward incompatible change:
Improvements:

* No buffer copy when using ``io.BufferedIOBase.read`` with exactly
- buffer size. This may lead performance improvement.
+ buffer size. This may lead to performance improvement.
* Minimum packages versions are set in setup based on packages changelog or
date.

Fixes:

- * ``isfile`` now correctly returns ``False`` when used on directory.
+ * ``isfile`` now correctly returns ``False`` when used on a directory.
* ``relpath`` now keeps ending ``/`` on cloud storage path (Directory marker).

1.0.0 (2018/08)
---------------

- First version that implement the core machinery.
+ The first version that implements the core machinery.

Provides cloud storage equivalent functions of:

* ``open`` / ``io.open``, ``shutil.copy``, ``os.path.getmtime``,
``os.path.getsize``, ``os.path.isfile``, ``os.path.relpath``.

- Provides cloud objects abstract classes with following interfaces:
+ Provide cloud objects abstract classes with the following interfaces:

* ``io.RawIOBase``, ``io.BufferedIOBase``.

- Adds support for following cloud storage:
+ Add support for following cloud storage:

* Alibaba Cloud OSS
* AWS S3
* OpenStack Swift

- Adds read only generic HTTP/HTTPS objects support.
+ Add read-only generic HTTP/HTTPS objects support.

Known issues
------------

- * Append mode don't work with ``ObjectBufferedIOBase``.
* ``unsecure`` parameter is not supported on Google Cloud Storage.
+ * Append mode doesn't work with ``ObjectBufferedIOBase``.
