First, if the data was accidentally added, you can remove the data files that cause the difference in schema, drop the partition, and re-crawl the data. Also, I don't feel that it would fall into Athena's charter, as it is just an analysis engine on data stored somewhere. Users get billed only for the queries they execute. P.S. After the upload, Athena would transform the data again and the deleted rows won't show up.

Athena does not have that support as of now. QSFT/athena-cmd: run AWS Athena SQL scripts either from the CLI or as a Lambda. On paper, this seemed equivalent to and easier than mounting the data as …

Is it possible to delete data stored in S3 through an Athena query? In Athena, DROP TABLE removes the metadata table definition for the table named table_name. In Amazon Redshift, DELETE is available: if the WHERE clause is specified, only the matching rows are deleted. If PERCENT is specified, then the top rows are based on a top_value percentage of the total result set (as specified by the PERCENT value).

I think it is the simplest way to go. I wish to select only rows with event_id 303, so I do this. You can leverage Athena to find out all the files that you want to delete and then delete them separately. See https://docs.aws.amazon.com/athena/latest/ug/ctas.html. Later you can replace the old files with the new ones created by CTAS.

I am trying to drop a few tables from Athena and I cannot run multiple DROP queries at the same time. This is cool, thanks for sharing, but I can't delete the entire file; I need to delete specific lines in the files with the bad data. Does not support timestamp with time zone. Were you able to find a solution for this problem, like a custom solution? Select "$path" from where
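The "find the files with Athena, then delete them separately" idea above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the table name `my_db.events` and the filter are hypothetical placeholders, and the helper names are invented for this example. It only builds the `"$path"` query and groups the returned S3 URIs into batches of at most 1000 keys, the shape that S3's DeleteObjects request accepts. Running the query and issuing the deletes are left to whatever client you use.

```python
from collections import defaultdict

# Hypothetical placeholders; substitute your own table and condition.
TABLE = "my_db.events"
BAD_ROWS_FILTER = "event_id = 303"


def build_path_query(table, where):
    """Return Athena SQL listing each S3 object that holds a matching row."""
    return f'SELECT DISTINCT "$path" FROM {table} WHERE {where}'


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"missing bucket or key: {uri!r}")
    return bucket, key


def delete_batches(uris, batch_size=1000):
    """Yield (bucket, payload) pairs, each with at most batch_size keys."""
    keys_by_bucket = defaultdict(list)
    for uri in uris:
        bucket, key = split_s3_uri(uri)
        keys_by_bucket[bucket].append(key)
    for bucket, keys in keys_by_bucket.items():
        for start in range(0, len(keys), batch_size):
            chunk = keys[start:start + batch_size]
            yield bucket, {"Objects": [{"Key": k} for k in chunk]}


if __name__ == "__main__":
    print(build_path_query(TABLE, BAD_ROWS_FILTER))
    # Example URIs as they might come back from the query (made up here).
    found = ["s3://example-bucket/data/part-0001.csv"]
    for bucket, payload in delete_batches(found):
        # With boto3 (an assumption, not imported here) this would be:
        #   boto3.client("s3").delete_objects(Bucket=bucket, Delete=payload)
        print(bucket, payload)
```

Note that deleting an object removes every row stored in it, not only the matching ones. When only specific lines are bad, the CTAS route mentioned above (write the good rows to new files, then swap them in) is the safer fit.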