Azure append blob storage does not support the Spark textFile API

Hello,

When I run sc.textFile('/path to an append blob'), I get the following error:

Caused by: com.microsoft.azure.storage.StorageException: Incorrect Blob type, please use the correct Blob type to access a blob on the server. Expected BLOCK_BLOB, actual UNSPECIFIED.
at com.microsoft.azure.storage.blob.CloudBlob$8.preProcessResponse(CloudBlob.java:1306)
at com.microsoft.azure.storage.blob.CloudBlob$8.preProcessResponse(CloudBlob.java:1272)
at com.microsoft.azure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:146)
at com.microsoft.azure.storage.blob.CloudBlob.downloadAttributes(CloudBlob.java:1265)
at com.microsoft.azure.storage.blob.BlobInputStream.<init>(BlobInputStream.java:155)
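
A minimal sketch of the kind of call that leads to the trace above, assuming PySpark and an existing SparkContext named sc (as in the pyspark shell). The helper name first_lines and the wasb:// path with its <container>, <account>, and <path-to-append-blob> placeholders are hypothetical and only for illustration; they are not taken from the actual job.

```python
def first_lines(sc, path, n=5):
    """Read `path` with sc.textFile and return up to `n` lines.

    `sc` is assumed to be an existing SparkContext (for example the one
    provided by the pyspark shell).
    """
    rdd = sc.textFile(path)  # lazy: only defines the RDD, no data is read yet
    return rdd.take(n)       # action: Spark now opens and reads the file


# Hypothetical placeholder path; substitute a real container, account, and blob.
APPEND_BLOB_PATH = (
    "wasb://<container>@<account>.blob.core.windows.net/<path-to-append-blob>"
)
```

In the shell this would be invoked as first_lines(sc, APPEND_BLOB_PATH). Because textFile is lazy, the StorageException only surfaces once an action such as take runs.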

It seems that Spark can only read data from block blobs. I checked the Azure Storage SDK in HDInsight, and it is version 2.2.0, while append blob support was only added in azure-storage 3.0.0. My question is: does azure-storage 3.0.0 support the Spark textFile API? If so, how can I update azure-storage to the latest version in HDInsight?

Thanks

Jun


September 14th, 2015 6:13pm