Files are already on S3, how to get MediaCloud to recognize them?
I am currently using WPEngine and their largefs service, which puts the uploads directory on S3 but is transparent to WordPress.
What I would like to do now is install MediaCloud but not have to migrate all the images to the cloud (because they are already there, even though Wordpress doesn't know it).
Is there a way to tell MediaCloud to just use the images already in my S3 bucket? I've seen that you have a new CLI tool to migrate from the HumanMade S3 plugin and I was wondering how exactly that works and if I could maybe modify it to do what I need?
I would assume MediaCloud would need to go through my whole media library, verify that each file exists in S3, and then write some metadata so it knows to serve that file from S3. Is that similar to what wp mediacloud migrateS3Uploads does?
There's the Import from Cloud Storage that will do what you want. Just make sure to specify "Import Only" so it doesn't try to download anything. The function will match up what's on S3 with what's in your Media Library (well in normal situations anyways, not sure with WPEngine's largefs tbh).
Great! I'll give that a try. Does the importFromCloud CLI command have offset parameters to do it in batches?
I'm trying the WP-CLI command on an install with 117,000 images. It runs for about half an hour and then PuTTY gives the error message "Software caused connection abort". Trying it from the WP backend now, but I don't think that will fare any better.
I think PuTTY is closing the connection due to inactivity on your end.
I'd recommend running the command in a screen session so if you do get disconnected, it'll still be running when you return.
You could also try
nohup mediacloud whatever
nohup would actually be easiest I think.
Once you start the command with nohup, it'll generate a log file called nohup.out. You can then watch the progress via tail:
tail -f nohup.out
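For reference, a minimal sketch of the nohup approach with an explicit log file. The helper name run_detached is made up for illustration and is not part of WP-CLI or Media Cloud; it just wraps nohup so the job keeps running after the SSH session drops and its output lands in a file you choose rather than nohup.out.

```shell
#!/bin/sh
# Run a command detached from the terminal, sending all output to a log file.
# run_detached is a hypothetical helper name (an assumption for this sketch).
run_detached() {
    log_file=$1
    shift
    # nohup makes the command ignore SIGHUP, so it survives a closed session.
    # stdout and stderr are both redirected to the chosen log file.
    nohup "$@" > "$log_file" 2>&1 &
    # Print the PID of the background job so it can be checked later.
    echo $!
}
```

Usage would look something like: run_detached import.log wp mediacloud importFromCloud --import-only --skip-thumbnails, followed by tail -f import.log to watch progress. Note that nohup only protects against hangups; it will not stop the host from killing a long-running process.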
So I'm running it via nohup but what should I be seeing in the output?
Actually when I ran wp mediacloud importFromCloud directly from the command line I got no output either. Should it be logging every media file as it goes?
Yes, but if you have 112,000 images it's going to take time for Media Cloud to assemble the list of files to import.
You should also be able to see the progress via WordPress admin in the "Import from ..." page.
What service provider are you using? S3?
Do you still see the command running via:
ps aux | grep mediacloud
I do see it running:
wpe-user 45 17.0 0.8 562760 125940 pts/0 S 02:31 0:10 php /usr/local/bin/wp mediacloud importFromCloud --import-only --skip-thumbnails
I also figured out how to set Putty to send keepalives to prevent the terminal from closing. So I'll let it run overnight and see what happens.
Hopefully WPEngine won't kill it.
Let me know how it goes.
Looks like it was killed
+ Killed nohup wp mediacloud importFromCloud --import-only --skip-thumbnails
Gonna have to contact WPEngine support and maybe they can run it.
Can you use the Storage Browser?
I asked WPEngine tech support to run the import command from their end. It's been running for 24 hours now. Does that seem reasonable for 117,000+ images?
Dunno, never run 117K images. Even though you --skip-thumbnails, Media Cloud still has to sort through all of that, so if it's 117K + thumbnails, I mean that's really 300K entries it has to filter.
It could take a while.
I could add "filtering" which would allow you to run batches, but I can't get to that until tomorrow my time (I live in Vietnam, it's 11AM right now).
Does the Storage Browser load?
I can get the storage browser to load on the backend. I haven't tried importing from that screen.
I think adding filtering would be ideal, then at least I can see how long it takes for say 100 files. Just so I have a feel for if it is working. No big rush on this, you are super responsive as it is so take your time!
I'm such an idiot.
There's actually already a path filter for the import command on the command line:
wp mediacloud importFromCloud --import-path=your/path/ --import-only --skip-thumbnails
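A rough sketch of how the path filter could be used to import in batches, one prefix at a time. It assumes the bucket follows the usual WordPress year/month upload layout under some base prefix; the function name batch_commands and the example base path are placeholders, and the real prefix in a WPEngine largefs bucket may look different. The function only prints the commands, so they can be reviewed before anything runs.

```shell
#!/bin/sh
# Print one importFromCloud command per year/month prefix.
# Assumption: files live under <base>/<YYYY>/<MM>/ in the bucket.
# batch_commands is a hypothetical helper name for this sketch.
batch_commands() {
    base=$1
    first_year=$2
    last_year=$3
    year=$first_year
    while [ "$year" -le "$last_year" ]; do
        for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
            # Flags are the ones already used earlier in this thread.
            echo "wp mediacloud importFromCloud --import-path=${base}/${year}/${month}/ --import-only --skip-thumbnails"
        done
        year=$((year + 1))
    done
}
```

For example, batch_commands wp-content/uploads 2018 2019 prints 24 commands. Running just the first one gives a feel for how long a single month takes before committing to the rest.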
I looked into adding a file name filter, but that's not possible with S3 or S3 compatible cloud storage providers.
Ah, I thought you were talking about adding an offset ability. But I'll try the path filtering to do it in batches.
I just heard from WPEngine that the current status is [2715 of 458352], which seems awfully slow (it's been over 24 hours), but I'm inclined to just let it keep going.
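A back-of-the-envelope estimate of what that status implies, assuming the rate stays constant and treating the elapsed time as exactly 24 hours (both are assumptions; the real rate may change once the file list is assembled). The function name remaining_hours is made up for this sketch.

```shell
#!/bin/sh
# Estimate the hours remaining from a "[done of total]" status line,
# assuming a constant processing rate. Integer arithmetic, rounds down.
# remaining_hours is a hypothetical helper name for this sketch.
remaining_hours() {
    done_count=$1
    total=$2
    elapsed_hours=$3
    echo $(( (total - done_count) * elapsed_hours / done_count ))
}

# Numbers taken from the status reported above.
remaining_hours 2715 458352 24
```

At that pace the remaining 455,637 entries work out to roughly 4,000 hours, i.e. several months, which is a strong argument for trying the path filter on a small batch instead of waiting.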