I'm trying to download some files from Google Cloud Storage (log files from a Google Play-published app).
My code looks like this:
// Point the client library at the service account credentials for this process.
Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", "my-service-account-credential.json", EnvironmentVariableTarget.Process);
StorageClient storageClient = StorageClient.Create();

var bucketName = "mybucketname";
var bucket = storageClient.GetBucket(bucketName);
var objects = storageClient.ListObjects(bucketName).ToList();

foreach (var o in objects)
{
    try
    {
        // Object names contain directory-like prefixes, so create the local folder first.
        Directory.CreateDirectory(Path.GetDirectoryName(o.Name));
        using (var fs = File.Open(o.Name, FileMode.OpenOrCreate))
        {
            await storageClient.DownloadObjectAsync(bucketName, o.Name, fs);
        }
    }
    catch (Exception e)
    {
        // Hack: swallow the hash validation failure and move on to the next object.
        if (e.Message.StartsWith("Incorrect hash"))
        {
            continue;
        }
        throw;
    }
}
The code actually seems to work fine (judging by the content of the downloaded files, which are CSV files). But as you can see, I have implemented a nasty try/catch hack, because every file I download throws an exception stating that the hash is incorrect. I'm assuming the client library compares the hash of the downloaded content with the hash stored for the object in the bucket, and that these are not identical, resulting in an exception.
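(As a rough sanity check of that assumption, I can compare a hash computed locally against the metadata the bucket exposes. The sketch below uses the Md5Hash property on the listed object purely as an illustration; judging by the short base64 value in the exception, the library itself may well be validating a CRC32C rather than an MD5.)

using System;
using System.IO;
using System.Security.Cryptography;

// Rough local check: hash the downloaded file and compare it with the
// base64-encoded Md5Hash the bucket reports for the object.
// (Illustration only - the client library may validate a different checksum.)
static bool MatchesStoredMd5(Google.Apis.Storage.v1.Data.Object o, string localPath)
{
    using (var md5 = MD5.Create())
    using (var stream = File.OpenRead(localPath))
    {
        string localHash = Convert.ToBase64String(md5.ComputeHash(stream));
        return localHash == o.Md5Hash;
    }
}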
The exception is:
System.IO.IOException: Incorrect hash: expected 'DXpVGw==' (base64), was '2RMrcw==' (base64)
at Google.Cloud.Storage.V1.StorageClientImpl.<DownloadObjectAsyncImpl>d__48.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
at MyClass.GoogleBucket.Functions.<DownloadGoogleBucketLogs>d__1.MoveNext() in mycode.cs:line 51
So my question is: how do you download objects without getting this exception? Clearly one is not supposed to do what I have done.
TL;DR: Update to 2.1.0 when that's out. (Or fetch and build the source before then if you're desperate.)
This was a tricky one to fix.
The issue is that HttpClient was automatically decompressing the data on the fly, but the hash provided by the server was for the compressed content. We've now made changes to both the REST API support library and the Google.Cloud.Storage.V1 library to intercept and hash the downloaded data before decompression. The changes have been merged on GitHub, and will be in the 2.1.0 release, which I expect to happen in early January.
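If it helps to see the failure mode in isolation, here's a small self-contained illustration that doesn't use the storage library at all (MD5 is used purely for demonstration): hashing the compressed bytes the server sends gives a different value from hashing the bytes the client sees after transparent decompression.

using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;

class HashMismatchDemo
{
    static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes("some,csv,content\n1,2,3\n");

        // Simulate what the server sends: gzip-compressed bytes, hashed as-is.
        byte[] compressed;
        using (var ms = new MemoryStream())
        {
            using (var gzip = new GZipStream(ms, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(original, 0, original.Length);
            }
            compressed = ms.ToArray();
        }

        using (var md5 = MD5.Create())
        {
            // The server's hash covers the compressed payload...
            string serverSideHash = Convert.ToBase64String(md5.ComputeHash(compressed));
            // ...but after transparent decompression the client only sees
            // (and hashes) the decompressed bytes.
            string clientSideHash = Convert.ToBase64String(md5.ComputeHash(original));

            Console.WriteLine($"Compressed hash:   {serverSideHash}");
            Console.WriteLine($"Decompressed hash: {clientSideHash}");
            // The two differ, which is exactly the "Incorrect hash" failure mode.
        }
    }
}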
Note that this won't fix a separate corner case where client-side decompression is disabled, resulting in server-side decompression, but still with a hash of the compressed content. We're tracking that separately, but it wouldn't affect the sample code here, as you'd only see it if you created a StorageService explicitly, disabled gzip support in the initializer, and then created a StorageClient to wrap that service.
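To make that concrete, the corner case would need configuration along these lines. This is a hypothetical sketch only: it assumes the GZipEnabled flag on BaseClientService.Initializer and a StorageClientImpl constructor that wraps an existing StorageService, and it omits credentials for brevity.

using Google.Apis.Services;
using Google.Apis.Storage.v1;
using Google.Cloud.Storage.V1;

// Explicitly-created service with client-side gzip support turned off...
var service = new StorageService(new BaseClientService.Initializer
{
    ApplicationName = "my-app",   // hypothetical value
    GZipEnabled = false           // the unusual bit that triggers the corner case
    // HttpClientInitializer (credentials) omitted for brevity
});

// ...wrapped in a StorageClient. Code using StorageClient.Create() is unaffected.
StorageClient client = new StorageClientImpl(service);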
See more on this question at Stackoverflow