Simple implementation of N-Gram, tf-idf and Cosine similarity in Python

Check out the NLTK package (http://www.nltk.org); it has everything you need.

For the cosine similarity:

    import math
    import numpy

    def cosine_distance(u, v):
        """
        Returns the cosine of the angle between vectors v and u.
        This is equal to u.v / |u||v|.
        """
        return numpy.dot(u, v) / (math.sqrt(numpy.dot(u, u)) * math.sqrt(numpy.dot(v, v)))

For ngrams:

    def ngrams(sequence, n, pad_left=False, pad_right=False, pad_symbol=None):
        """ …
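The excerpt is cut off before the n-gram and tf-idf parts, so here is a minimal pure-Python sketch of all three steps. It is not NLTK's implementation: the function names, the sample documents, and the tf-idf weighting (term frequency divided by document length, multiplied by log(N / df)) are assumptions chosen for illustration, and other weighting variants are equally common.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of contiguous n-token tuples in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tf_idf(docs):
    """Map each tokenized document to a {term: tf * idf} dictionary.

    tf is the raw count divided by the document length; idf is
    log(N / df). This is one common variant, assumed here.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector has no direction
    return dot / (norm_a * norm_b)


# Hypothetical sample corpus, tokenized by whitespace.
docs = [
    "the cat sat on the mat".split(),
    "the cat lay on the rug".split(),
    "dogs chase cars".split(),
]
bigrams = ngrams(docs[0], 2)
vectors = tf_idf(docs)
sim_01 = cosine_similarity(vectors[0], vectors[1])  # documents share terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # no terms in common
```

Storing the vectors as dictionaries keeps them sparse, so the dot product only visits terms that actually occur in the first document. For large corpora a dense or sparse matrix library would be a more practical choice.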

Difference between screen.availHeight and window.height()

window.outerHeight
The height of the window on screen. It includes the page and all of the browser's visible bars (location, status, bookmarks, window title, borders, …). This is not the same as jQuery's $(window).outerHeight().

window.innerHeight or $(window).height()
The height of the viewport that shows the website: just the content, without the browser's bars.

document.body.clientHeight or $(document).height()
…

Create a folder inside documents folder in iOS apps

I do it the following way:

    NSError *error = nil;
    NSArray *paths = NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES);
    NSString *documentsDirectory = [paths objectAtIndex:0]; // Get documents folder
    NSString *dataPath = [documentsDirectory stringByAppendingPathComponent:@"/MyFolder"];

    if (![[NSFileManager defaultManager] fileExistsAtPath:dataPath]) {
        // Create folder
        [[NSFileManager defaultManager] createDirectoryAtPath:dataPath withIntermediateDirectories:NO attributes:nil error:&error];
    }

javascript to get paragraph of selected text in web page

This is actually rather hard to do because you have to consider six cases:

1. The selection is not within a paragraph (easy);
2. The entire selection is within one paragraph (easy);
3. The entire selection crosses one or more sibling paragraphs (harder);
4. The selection starts or ends in an element not within a paragraph (harder);
5. The paragraphs …

How to get the size of single document in Mongodb?

In the previous call of Object.bsonsize(), MongoDB returned the size of the cursor rather than the size of the document. The correct way is to use this command:

    Object.bsonsize(db.test.findOne())

With findOne(), you can define your query for a specific document:

    Object.bsonsize(db.test.findOne({type: "auto"}))

This returns the correct size (in bytes) of that particular document.

How do I load an org.w3c.dom.Document from XML in a string?

Whoa there! There’s a potentially serious problem with this code, because it ignores the character encoding specified in the String (which is UTF-8 by default). When you call String.getBytes() the platform default encoding is used to encode Unicode characters to bytes. So, the parser may think it’s getting UTF-8 data when in fact it’s getting …
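The same pitfall exists outside Java. The following is a rough Python sketch of the failure mode, not the Java code under discussion: Latin-1 stands in for a non-UTF-8 platform default (an assumption), and the sample document is made up.

```python
import xml.dom.minidom
import xml.parsers.expat

# Hypothetical document that declares UTF-8 and contains a non-ASCII character.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><name>José</name>'

# Encoding with a charset other than the declared one mimics calling
# String.getBytes() on a platform whose default encoding is not UTF-8.
wrong_bytes = xml_text.encode("latin-1")
right_bytes = xml_text.encode("utf-8")

try:
    xml.dom.minidom.parseString(wrong_bytes)
    mismatch_detected = False
except xml.parsers.expat.ExpatError:
    # The parser trusted the declaration and hit bytes that are not valid UTF-8.
    mismatch_detected = True

# When the bytes match the declared encoding, parsing succeeds.
doc = xml.dom.minidom.parseString(right_bytes)
name = doc.documentElement.firstChild.data
```

In this sketch the mismatch fails loudly. With other byte sequences a mismatch can instead produce silently corrupted characters, which is harder to notice.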