
aliostad · on Jan 29, 2020

Here is the terminal output of what I did to remove the file from the git history:

~/g/aliostad bfg --delete-files 1703 deep-learning-lang-detection.git

Using repo : /Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git

Found 72811 objects to protect
Found 2 commit-pointing refs : HEAD, refs/heads/master

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit ac12aa68 (protected by 'HEAD') - contains 8 dirty files :
	- data/stackoverflow-snippets/cpp/1703 (3.0 KB)
	- data/stackoverflow-snippets/csharp/1703 (835 B)
	- ...

WARNING: The dirty content above may be removed from other commits, but as the protected commits still use it, it will STILL exist in your repository.

Details of protected dirty content have been recorded here :

/Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03/protected-dirt/

If you really want this content gone, make a manual commit that removes it, and then run the BFG on a fresh copy of your repo.

Cleaning
--------

Found 69 commits
Cleaning commits:       100% (69/69)
Cleaning commits completed in 304 ms.

Updating 1 Ref
--------------

	Ref                 Before     After
	---------------------------------------
	refs/heads/master | ac12aa68 | c51406cc

Updating references:    100% (1/1)
...Ref update completed in 13 ms.

Commit Tree-Dirt History
------------------------

	Earliest                                              Latest
	|                                                          |
	.................................................DDDDDDDDDDm

	D = dirty commits (file tree fixed)
	m = modified commits (commit message or parents changed)
	. = clean commits (no changes to file tree)

                         Before     After
 -------------------------------------------
 First modified commit | a4a1bbac | cb32cfbf
 Last dirty commit     | 45322921 | 6b9e8d5d

Deleted files
-------------

	Filename   Git id
	---------------------------------------------------
	1703     | 530293d7 (614 B), 98c9b646 (3.0 KB), ...

In total, 47 object ids were changed. Full details are logged here:

/Users/alikheyrollahi/github/aliostad/deep-learning-lang-detection.git.bfg-report/2020-01-27/22-24-03

BFG run is complete! When ready, run:
git reflog expire --expire=now --all && git gc --prune=now --aggressive

-- You can rewrite history in Git - don't let Trump do it for real! Trump's administration has lied consistently, to make people give up on ever being told the truth. Don't give up: https://www.aclu.org/ --

~/g/aliostad cd deep-learning-lang-detection.git

~/g/a/deep-learning-lang-detection.git git reflog expire --expire=now --all && git gc --prune=now --aggressive
Enumerating objects: 89539, done.
Counting objects: 100% (89539/89539), done.
Delta compression using up to 8 threads
Compressing objects: 100% (89537/89537), done.
Writing objects: 100% (89539/89539), done.
Total 89539 (delta 28336), reused 61123 (delta 0)

~/g/a/deep-learning-lang-detection.git git push
Enter passphrase for key '/Users/alikheyrollahi/.ssh/id_rsa':
Enumerating objects: 89539, done.
Counting objects: 100% (89539/89539), done.
Delta compression using up to 8 threads
Compressing objects: 100% (61201/61201), done.
Writing objects: 100% (89539/89539), 40.83 MiB | 1.01 MiB/s, done.
Total 89539 (delta 28336), reused 89539 (delta 28336)
remote: Resolving deltas: 100% (28336/28336), done.
To github.com:aliostad/deep-learning-lang-detection.git
 + ac12aa680...c51406cc8 master -> master (forced update)

~/g/a/deep-learning-lang-detection.git cd ..
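
For reference, the BFG warning in the output above says the commit protected by HEAD still contained 8 files named 1703, and that a manual commit removing them is needed before running BFG on a fresh copy. A minimal sketch of how one could list those files first, assuming a regular (non-bare) clone; the helper names `paths_named` and `dirty_files_in_commit` are made up for illustration and are not part of git or BFG:

```shell
#!/bin/sh
# Hypothetical helper: read paths on stdin and print only those whose
# final path component is exactly $1 (string comparison, not numeric).
paths_named() {
    awk -F/ -v name="$1" '($NF "") == (name "")'
}

# Hypothetical helper: list files in a commit (default HEAD) whose
# basename is $1. Must be run inside a git repository.
dirty_files_in_commit() {
    git ls-tree -r --name-only "${2:-HEAD}" | paths_named "$1"
}

# Possible usage (not run here; assumes paths without unusual characters):
#   dirty_files_in_commit 1703                      # what is still in HEAD
#   dirty_files_in_commit 1703 | xargs git rm --quiet --
#   git commit -m "Remove files named 1703"
#   git push
#   # then take a fresh `git clone --mirror` and re-run
#   #   bfg --delete-files 1703 <repo>.git
```

If `dirty_files_in_commit 1703` prints nothing for HEAD, BFG should have no protected dirty content left to warn about for that filename.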

aliostad · on Jan 29, 2020

This is essentially what I did, using the bfg tool. They still took it down.

https://help.github.com/en/github/authenticating-to-github/r...

zegerjan · on Jan 29, 2020

I'm saying the blob was in a new repository, you had no control over. You couldn't have removed it, you could only make sure it doesn't get referenced in _your_ repository. Which is what you did.

aliostad · on Jan 29, 2020

sure, but I imagine they have already removed those forks too.

Have a look at this list https://github.com/github/dmca/blob/master/2020/01/2020-01-2...

aliostad · on Jan 29, 2020

Lack of communication and courtesy - disrespect to public good. I am happy to remove the mention per his request.

tastroder · on Jan 29, 2020

Fair enough, communication could indeed be improved. NAL but I was under the impression that, while they could surely be more helpful here, once they received that official DMCA takedown notice they don't really have a choice in the matter of taking it down or not.

Edit: disabling the repository after being notified by you within 24 hours seems to be against their own policy at https://help.github.com/en/github/site-policy/dmca-takedown-... - have you tried contacting their support again?

aliostad · on Jan 29, 2020

Well, they said I had 24 hours to remove the offending item by following the "remove sensitive data" link, which I did within a few minutes. They still took down the repo - that is the problem, not the sending of the notice.

"We're giving you 24 hours to make the changes identified in the following notice:

https://github.zendesk.com/attachments/token/BqByLyvvRzOAmVy...

If you need to remove specific content from your repository, simply making the repository private or deleting it via a commit won't resolve the alleged infringement. Instead, you must follow these instructions to remove the content from your repository's history, even if you don't think it's sensitive:

https://help.github.com/articles/remove-sensitive-data

aliostad · on May 25, 2017

My understanding is that they just hand off the payload to another queueing mechanism. So it just returns ACK and is done.

aliostad · on May 25, 2017

this is yet the "more detailed" post. Here is the original story https://customers.microsoft.com/en-US/story/raygun which I objected to https://twitter.com/aliostad/status/849186214045528064

aliostad · on May 10, 2017

We just ran a benchmark for a PoC with DocumentDB side by side with Cassandra. It does the job, but I have not yet seen anything revolutionary. The Cassandra benchmarks seemed better.

aravindkr1 · on May 10, 2017

One key difference is the cost difference between running Cosmos DB and Cassandra. We have a TCO paper (https://aka.ms/documentdb-tco-paper) that shows that for a 1M operations/second workload, Cosmos DB is significantly 3x-10x cheaper than other systems.

brianwawok · on May 11, 2017

And white papers are full of it. It has random things like "a Cassandra cluster needs 1 full-time engineer per 100 nodes". Where do you come up with this stuff?

throwanem · on May 11, 2017

In my experience of Cassandra (working on an application client, not managing it), one FTE per hundred nodes is extremely generous.

bshanks · on May 11, 2017

Do you mean that you typically need more engineers, or fewer?

throwanem · on May 11, 2017

The former. One FTE per dozen nodes is more in line with what I've seen in practice.

rattray · on May 10, 2017

Given that this makes bolder claims than Cassandra, near-parity is pretty impressive, no?

(I don't know a ton about either technology, please correct me if I'm wrong)

aliostad · on May 10, 2017

DocumentDB is KV.

rattray · on May 10, 2017

DocumentDB looks like a JSON document store?

hoschicz · on May 10, 2017

It is.

aliostad · on June 20, 2016

Thanks for taking the survey. And bear in mind after all it will be your data :)

aliostad · on April 22, 2016

For those who had the opportunity and pleasure of meeting you personally, the day the news broke was a black day. You are a person who makes a deep impression through your thoughtfulness, your very balanced views and how you articulate them. I now read your writings and find them even more compelling: sharp observation and the bravery to spell the truth out.

Death is coming to all of us. We all die. Death of some, however, will be a big loss. You, sir, are among them.

aliostad · on Dec 29, 2015

Data and Viz @ https://github.com/aliostad/wiki-rock