Thursday, December 27, 2018

"this app can break"

“This app can break”

• When we type this statement in Notepad, save it, and then reopen the same file, we see the following instead of our text:

[][][][][][][][][]

The explanation is as follows
----------------------------------------------------

• Phrases like “this app can break” are often passed around as “hoaxes”, but the effect is real and has a simple cause.
• When we save such a sentence and reopen it, Notepad tries to guess the file encoding and guesses wrong.
• When we save the text, Notepad writes it in ANSI format (one byte per character). When we open the same file, Notepad misdetects it as Unicode (UTF-16) and decodes it that way.
• Notepad makes use of a built-in window class named "EDIT". Up to Windows 95, Fixedsys was the only available font for Notepad. Windows NT 4.0 and 98 introduced the ability to change this font. In Windows 2000 and XP the default font was changed to Lucida Console.
• Notepad can edit traditional 8-bit text files as well as Unicode text files such as UTF-8 and UTF-16.
• Notepad accepts text from the Windows ‘clipboard’. When clipboard data with multiple formats is pasted into Notepad, the program will only accept text in the CF_TEXT format.
• The Windows NT version of Notepad, installed by default on Windows 2000 and XP, has the ability to detect Unicode files even when they are missing a byte order mark.
• It uses a Windows API function called “IsTextUnicode()”: we pass some data to it, and it tells us whether the data is likely to be UTF-16-encoded or not.
• This function is imperfect and incorrectly identifies some all-lowercase ASCII text as UTF-16.
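
The wrong guess can be reproduced outside Notepad. The following is a minimal Python sketch, not Notepad's actual code: it assumes the "ANSI" code page is cp1252 and that the misread uses little-endian UTF-16. It shows that the 18 bytes of the phrase pair up into 9 sixteen-bit characters, which matches the nine boxes shown above.

```python
# Sketch: what happens when ANSI bytes are misread as UTF-16.
phrase = "this app can break"

# Assumption: plain text is saved as one byte per character (cp1252 "ANSI").
raw = phrase.encode("cp1252")

# Misreading the same bytes as little-endian UTF-16 pairs them up:
# every two bytes become one 16-bit code unit.
garbled = raw.decode("utf-16-le")

print(len(raw), "bytes on disk")
print(len(garbled), "characters after the wrong guess")
for ch in garbled:
    # Print each resulting code point in U+XXXX notation.
    print(f"U+{ord(ch):04X}")
```

Every resulting code point lands in the CJK ideograph range, so a font without those glyphs draws them as empty boxes.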

• As a result, Notepad interprets a file containing a phrase like "aaaa aaa aaa aaaaa" as a two-byte Unicode text file and attempts to display it as such.
• UTF-16-encoded text files are supposed to start with a "Byte-Order Mark" (BOM), a two-byte flag that tells a reader how the following UTF-16 data is encoded. Since these two bytes are exceedingly unlikely to occur at the beginning of an ASCII text file, the BOM is commonly used to tell whether a text file is encoded in UTF-16.
• WinCustomize.com discovered an odd bug in Notepad that is triggered by a text file consisting of a four-letter word, two three-letter words, and a five-letter word. Some examples of such sentences are:
a) bush hid the facts
b) 1111 111 111 11111
c) this txt are longs
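
The byte order mark check described above can be sketched in a few lines of Python. This is an illustrative helper only (the name sniff_bom is hypothetical, and UTF-32 marks are ignored for brevity). It shows that none of the example sentences carries a BOM, which is why a reader has to fall back on guessing.

```python
import codecs

# Known byte order marks, mapped to the encoding they announce.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def sniff_bom(data: bytes):
    """Return the encoding announced by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

examples = ["bush hid the facts", "1111 111 111 11111", "this txt are longs"]
for text in examples:
    raw = text.encode("ascii")
    # No BOM is found, and the byte count is even, so the bytes
    # can be paired into UTF-16 code units without a remainder.
    print(text, "->", sniff_bom(raw), "|", len(raw), "bytes")

# A file that really was saved as UTF-16 with a BOM is identified directly.
print(sniff_bom(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")))
```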

• This happens because Notepad has no reliable way to recognize which format such a file is in, so it has to guess.
• If we tell Notepad which font to use, e.g. “Arial Unicode MS”, then it retains the original text format.
