78 changes: 38 additions & 40 deletions applications/cybersecurity/whitebox_fuzzing/whitebox_fuzzing.ipynb
Expand Up @@ -10,7 +10,7 @@
"tags": []
},
"source": [
" # Using Quantum Computers to Boost Whitebox Fuzzing"
"# Using Quantum Computers to Boost Whitebox Fuzzing"
]
},
{
Expand All @@ -20,13 +20,11 @@
"tags": []
},
"source": [
"## Introduction\n",
"This demonstration shows how to harness the power of quantum computers for **enhancing software security**. Specifically, it uses the quantum Grover algorithm to boost the process of whitebox fuzzing.\n",
"\n",
"In this demonstration we will show how to harness the power of quantum computers for **enhancing software security**. Specifically, We will use the Quantum Grover's algorithm for boosting the process of whitebox fuzzing.\n",
"According to [[1](#Whitebox)], the \"killer-app\" for whitebox fuzzing is the **testing of file and packet parsers**. As any vulnerability in such a parser might result in a costly security patch, it is worthwhile investing significant effort to protect the code.\n",
"\n",
"According to [[1](#Whitebox)], the \"killer-app\" for whitebox fuzzing is the **testing of file\\packet parsers**. Any vulnerability of such parser might result with a very costy security patch, so one would want to invest much effort in validating that such code is protected.\n",
"\n",
"### Fuzzing\n",
"## Fuzzing\n",
"\n",
"<center><img src=\"https://docs.classiq.io/resources/fuzzing.jpeg\" width=700/></center>"
]
Expand Down Expand Up @@ -56,12 +54,12 @@
"source": [
"### Whitebox Fuzzing\n",
"\n",
"Whitebox fuzzing, in particular, involves having access to the internal structure and code of the program. It combines static and dynamic analysis to not only execute the program with random inputs but also to achieve maximum code coverage, ensuring that all possible execution paths have been tested. This allows for more targeted and efficient testing. It usually consists of a \"symbolic execution\" part, in which the program is emulated in order to explore various branches, and gathered into a set of contraints.\n",
"Whitebox fuzzing, in particular, involves accessing the internal structure and code of the program. It combines static and dynamic analysis to not only execute the program with random inputs but also to achieve maximum code coverage, ensuring that all possible execution paths are tested. This allows for more targeted and efficient testing. It usually consists of a \"symbolic execution\" part: emulating the program to explore various branches and gathering them into a set of constraints.\n",
"\n",
"The constraints are solved by a constraint solver in order to generate new fuzzing input to the program.\n",
"The constraints are solved by a constraint solver, generating new fuzzing input to the program.\n",
"\n",
"### Toy Example\n",
"Let us emphasize the importance of whitebox testing in the following toy example. Take the following function, for which we want to trigger all relevant code flows:"
"This toy example illustrates the importance of whitebox testing. Consider the following function, for which the goal is to trigger all code flows:"
]
},
{
Expand Down Expand Up @@ -93,8 +91,8 @@
"id": "4cef65e3-644f-4615-824d-7ff52c5fb7c3",
"metadata": {},
"source": [
"Now (due to simulation limitations) say that x, y are 6 bit integers, so they are in the range [0, 63].\n",
"Let's try to get all the outputs of `foo` in a black-box way, e.g. by sampling random inputs."
"Now (due to simulation limitations) assume that `x` and `y` are six-bit unsigned integers, so they are in the range [0, 63].\n",
"Try to get all the outputs of `foo` in a black-box way, e.g., by sampling random inputs:"
]
},
{
Expand Down Expand Up @@ -149,9 +147,9 @@
"id": "1d89ee0c-cdbf-4e0d-a2f8-5d402c5158e5",
"metadata": {},
"source": [
"We Can see that with 500 inputs, we only reached 3 out of the 5 different outputs for `foo`.\n",
"Note that with 500 inputs, you only reach three of the five different outputs for `foo`.\n",
"\n",
"However, by following the flow of `foo`, we can generate the following constraints to the function:\n",
"However, by following the flow of `foo`, you can derive these constraints on its inputs, one per output:\n",
"- \"a\": $ (x = 12) \\land (y \\gt 3) $\n",
"- \"b\": $ (x = 12) \\land (y \\leq 3) $\n",
"- \"c\": $ (x + y \\lt 9) \\land (x \\neq 12) \\land (x \\times y \\mod 4 = 1)$\n",
Expand All @@ -164,7 +162,7 @@
"id": "2f88fe21-85b5-4bab-838a-ce781dc91f1f",
"metadata": {},
"source": [
"Now, In order to trigger each of the different outputs, we need to find inputs that satisfy the constraints. Some of the constraints might not be satisfiable for any input. Although this toy example is easy, the general case, which is an instance of Constraints Satisfaction Problem (CSP), is computationaly hard and belongs to the $\\text{NP-Complete}$ complexity class."
"Now, to trigger each of the different outputs, find inputs that satisfy the constraints. Some constraints might not be satisfiable for any input. Although this toy example is easy, the general case, which is an instance of the Constraint Satisfaction Problem (CSP), is computationally hard: it is $\\text{NP-Complete}$."
]
},
{
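For intuition about why black-box sampling struggles here, constraint "c" can be enumerated classically over the six-bit input range. This is a minimal sketch, not part of the notebook: the helper name `satisfies_c` is hypothetical, and the hit-probability estimate assumes the 500 inputs are drawn uniformly and independently.

```python
from itertools import product

MAX_VALUE = 63  # six-bit unsigned integers, as stated in the text


def satisfies_c(x, y):
    """Constraint "c": (x + y < 9) and (x != 12) and (x * y mod 4 == 1)."""
    return x + y < 9 and x != 12 and (x * y) % 4 == 1


# Brute-force enumeration of every (x, y) pair in the input space.
solutions = [
    (x, y)
    for x, y in product(range(MAX_VALUE + 1), repeat=2)
    if satisfies_c(x, y)
]

total_inputs = (MAX_VALUE + 1) ** 2
p_single = len(solutions) / total_inputs

# Chance that at least one of 500 uniform random inputs satisfies "c"
# (assumes independent uniform sampling).
p_hit = 1 - (1 - p_single) ** 500

print("solutions:", solutions)
print("fraction of input space:", p_single)
print("P(at least one hit in 500 samples):", round(p_hit, 3))
```

Only a handful of the 4096 possible inputs satisfy "c", which is why random sampling can easily miss that output.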
Expand All @@ -175,10 +173,10 @@
"## Here Comes the Quantum Part!\n",
"\n",
"### Grover's Algorithm\n",
"The physical nature of quantum computer can be harnessed to generate inputs to the function. Specifically, we can create a 'superposition' of all different inputs - a physical state which holds all the possible assignments to the function inputs, for which we compute whether the constraints is fulfilled. The quantum computer allows only a single classical output of the variable. That where Grover's algorithm [[1](#Gro97),[2](#GroWiki)] takes place - it can generate \"good\" samples with a high probability, achieving a quadratic(!) speedup over a classical brute-force approach.\n",
"The physical nature of a quantum computer can be harnessed to generate inputs to the function. Specifically, you can create a 'superposition' of all different inputs: a physical state that holds all the possible assignments to the function inputs, on which the constraint check is computed for every assignment at once. A measurement, however, yields only a single classical assignment. This is where Grover's algorithm [[2](#Gro97),[3](#GroWiki)] is useful: it can generate \"good\" samples with a high probability, achieving a quadratic(!) speedup over a classical brute force approach.\n",
"\n",
"### Oracle Function\n",
"In the heart of the algorithm, one needs to implement an oracle that compute for each state:\n",
"At the heart of the algorithm is an oracle that computes, for each state:\n",
"\n",
"$\n",
"O |x\\rangle =\n",
Expand All @@ -188,11 +186,11 @@
"\\end{cases}\n",
"$\n",
"\n",
"Classiq has a built-in Arithmetic engine for computing such oracles. Specifically, lets take the hardest constraint:\n",
"Classiq has a built-in arithmetic engine for computing such oracles. Specifically, take the hardest constraint:\n",
"\n",
"* **\"c\": $ (x + y < 9) \\land (x \\neq 12) \\land (x \\times y \\mod 4 = 1)$**\n",
"\n",
"We eliminate the $x \\neq 12$ as it is already satisfied given the first clause, and create a predicate function for it:"
"Eliminate the $x \\neq 12$ clause, as it is already implied by the first clause (for non-negative $y$, $x + y < 9$ forces $x < 9$), and create a predicate function for the remaining constraint:"
]
},
{
Expand All @@ -216,7 +214,7 @@
"id": "e0495ab1-cfde-4bd3-9488-3cc7417cfef2",
"metadata": {},
"source": [
"And we create a phase oracle"
"Create a phase oracle:"
]
},
{
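The effect of a phase oracle can be emulated classically on a plain list of amplitudes. The sketch below is not the Classiq implementation: it is a pure-Python state-vector toy in which the index encoding `i = (x << 6) | y` and the helper names are assumptions made for illustration. It applies the sign flip for inputs satisfying "c", followed by the standard inversion about the mean, and prints how the probability of sampling a "c" input grows over the first few iterations.

```python
import math

N_BITS = 6                  # x and y are six-bit integers, as in the text
MASK = (1 << N_BITS) - 1


def decode(index):
    """Hypothetical encoding: high six bits hold x, low six bits hold y."""
    return index >> N_BITS, index & MASK


def is_c(x, y):
    """Constraint "c" with the redundant x != 12 clause dropped."""
    return x + y < 9 and (x * y) % 4 == 1


def grover_iteration(amps):
    """One Grover step: phase oracle, then inversion about the mean."""
    # Phase oracle: negate the amplitude of every input satisfying "c".
    flipped = [-a if is_c(*decode(i)) else a for i, a in enumerate(amps)]
    # Diffusion: reflect every amplitude about the average amplitude.
    mean = sum(flipped) / len(flipped)
    return [2 * mean - a for a in flipped]


def good_probability(amps):
    """Total probability of measuring an input that satisfies "c"."""
    return sum(a * a for i, a in enumerate(amps) if is_c(*decode(i)))


size = 2 ** (2 * N_BITS)
amps = [1 / math.sqrt(size)] * size  # uniform superposition over all inputs

for step in range(4):
    print(step, good_probability(amps))
    amps = grover_iteration(amps)
```

Each iteration is a rotation in the plane spanned by the "good" and "bad" components, so the printed probability rises steadily for small iteration counts.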
Expand Down Expand Up @@ -244,7 +242,7 @@
"id": "e24742e5-11e5-4298-a134-db9699030a51",
"metadata": {},
"source": [
"Let's see how a quantum oracle looks like:"
"See how a quantum oracle looks:"
]
},
{
Expand Down Expand Up @@ -311,7 +309,7 @@
"id": "b7cd7593-a777-46a1-8a62-bdec7c1fae27",
"metadata": {},
"source": [
"Now create the full circuit implementation the Grover's algorithm:"
"Now create the full circuit implementation of Grover's algorithm."
]
},
{
Expand All @@ -327,7 +325,7 @@
"id": "a11a3fa1-5131-47d8-b03e-cbe31d4c7309",
"metadata": {},
"source": [
"Load the uniform superposition state - over all possible input assignments to `foo`, by using the `hadamard_transform`."
"Load the uniform superposition state over all possible input assignments to `foo` using the `hadamard_transform`:"
]
},
{
Expand Down Expand Up @@ -392,7 +390,7 @@
"id": "a3b4d267-e8b4-4b60-bc40-ee52bc4d0443",
"metadata": {},
"source": [
"The algorithm includes applying a quantum oracle in repetition, such that the probability to sample a good state \"rotates\" from low to high. Without knowing the concentration of solution before hand (which is the common case), one might overshoot with too many repetitions and not get a solution. Fixed Point Amplitude Amplification (FFPA) ([4](#FFPA)) for example, is a modification to the basic Grover algorithm, which does not suffer from the overshoot issue. However, here for simplicity we will use the basic Grover's Algorithm. We assume that this specific state only satisfied for a specific input, and calculate the number of oracle repetitions required:"
"The algorithm applies the quantum oracle repeatedly, such that the probability of sampling a good state \"rotates\" from low to high. Without knowing the concentration of solutions beforehand (which is the common case), one might overshoot with too many repetitions and not arrive at a solution. Fixed Point Amplitude Amplification (FPAA) [[4](#FPAA)], for example, is a modification to the basic Grover algorithm that does not suffer from the overshoot issue. However, here, for simplicity, use the basic Grover's algorithm. Assume that this specific constraint is satisfied by only a single input, and calculate the number of oracle repetitions required:"
]
},
{
Expand Down Expand Up @@ -420,15 +418,15 @@
"id": "6572e941-0b83-46d2-826a-fa4ab9406503",
"metadata": {},
"source": [
"This is indeed ~ the square root of the number of possible assignment - $2^{12}$ in this case!"
"This is indeed on the order of the square root of the number of possible assignments, which is $2^{12}$ in this case!"
]
},
{
"cell_type": "markdown",
"id": "0e513d93-948a-4390-a316-2a417879fb18",
"metadata": {},
"source": [
"In order to save simulation time, will simplify even further, and use only several grover repetitions, to show that this raises the probability to sample a \"c\" input."
"To save simulation time, simplify even further: use only several Grover repetitions to show that this raises the probability of sampling a \"c\" input:"
]
},
{
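The relation between the repetition count and the square root of the search-space size can be checked with the textbook formula for Grover's algorithm. This is a sketch under that standard formula, with hypothetical function names; it does not reproduce the notebook's own calculation. With $N$ assignments and $M$ solutions, the rotation angle is $\theta = \arcsin\sqrt{M/N}$, the success probability after $k$ iterations is $\sin^2((2k+1)\theta)$, and the best $k$ is close to $\pi/(4\theta) - 1/2$.

```python
import math


def optimal_grover_iterations(num_states, num_solutions=1):
    """Iteration count that brings the state closest to the solution subspace."""
    theta = math.asin(math.sqrt(num_solutions / num_states))
    return max(0, round(math.pi / (4 * theta) - 0.5))


def success_probability(num_states, num_solutions, iterations):
    """Probability of measuring a solution after the given number of iterations."""
    theta = math.asin(math.sqrt(num_solutions / num_states))
    return math.sin((2 * iterations + 1) * theta) ** 2


num_states = 2 ** 12  # two six-bit inputs
k = optimal_grover_iterations(num_states, num_solutions=1)

print("sqrt(N):", math.sqrt(num_states))
print("iterations for a single solution:", k)
print("success probability:", success_probability(num_states, 1, k))
```

The same `success_probability` helper also shows the overshoot effect: past the optimal count, the probability starts to fall again.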
Expand Down Expand Up @@ -460,7 +458,7 @@
"id": "b224aa49-4a3b-4ed3-ab59-6f47ca9840b6",
"metadata": {},
"source": [
"We will also set constraints for the resulting circuit, so we can simulate it on a quantum simulator:"
"Set constraints for the resulting circuit so you can simulate it on a quantum simulator:"
]
},
{
Expand All @@ -486,7 +484,7 @@
"source": [
"### Synthesizing the Model\n",
"\n",
"We proceed by synthesizing the circuit using Classiq's synthesis engine. The synthesis takes the should takes the high-level model definition and creates a quantum circuits implementation. It should take approximately several seconds:"
"Synthesize the circuit using the Classiq synthesis engine. The synthesis takes the high-level model definition and creates a quantum circuit implementation within a few seconds:"
]
},
{
Expand All @@ -513,7 +511,7 @@
"source": [
"### Showing the Resulting Circuit\n",
"\n",
"After Classiq's synthesis engine has finished the job, we can show the resulting circuit in the interactive GUI:"
"When the Classiq synthesis engine finishes the job, display the resulting circuit in the interactive GUI:"
]
},
{
Expand Down Expand Up @@ -543,9 +541,9 @@
}
},
"source": [
"### Executing the circuit\n",
"### Executing the Circuit\n",
"\n",
"Lastly, we can execute the resulting circuit with Classiq's execute interface, using the `execute` function."
"Lastly, execute the resulting circuit in the Classiq interface using the `execute` function:"
]
},
{
Expand Down Expand Up @@ -585,7 +583,7 @@
"id": "495cb8a0-02bc-4c2c-8a7c-970ccba7e884",
"metadata": {},
"source": [
"These are the counts of the sampled bit-strings, out of the 500 samples we drew from the circuit. You can see that few of the bit strings were sampled with a much higher probability. Let's Extract `x` and `y` values from the sampling results, to see whether we've got inputs for \"c\":"
"These are the counts of the sampled bit strings, from the 500 samples drawn from the circuit. Some of the bit strings are sampled with a much higher probability. Extract `x` and `y` values from the sampling results to determine if you have inputs for \"c\":"
]
},
{
Expand Down Expand Up @@ -622,7 +620,7 @@
"id": "62c1be9f-cfac-487b-95ff-2d05d05b07cc",
"metadata": {},
"source": [
"Which exactly corresponds to the 4 peaks in the results."
"This corresponds exactly to the four peaks in the results:"
]
},
{
Expand Down Expand Up @@ -653,7 +651,7 @@
"id": "dd171efd-69db-4513-95e9-ed66fbc3c895",
"metadata": {},
"source": [
"And \"c\" was indeed sampled with higher probability! If we did 36 grover repetitions, we would expect to get c with probability ~1."
"And \"c\" is indeed sampled with a higher probability! If you do 36 Grover repetitions, you would expect to get c with a probability of ~1."
]
},
{
Expand All @@ -662,8 +660,8 @@
"metadata": {},
"source": [
"## Notes\n",
"- While potentially \"black-box\" fuzzing can also benefit from quantum computers, it requires in general a high amount of quantum resources, in order to emulate the state of the classical program. On the otherhand, the \"white-box\" case is a lower hangning fruit, where a hybrid quantum-classical approach can be taken, with less resources required.\n",
"- Here we showed quadratic improvement in comparison to a classical \"brute-force\" solver. However in reality there are much faster classical solvers. As a basic example, a solver can \"prune\" branches of search by backtracking in case a partial assignment is already not satisfiable. Such modifications are in general feasible also on a quantum computer. For example, see [[5](#Backtrack)] on Quantum Backtracking."
"- While \"black-box\" fuzzing can also potentially benefit from quantum computers, a large quantity of quantum resources is generally required to emulate the state of the classical program. On the other hand, the \"white-box\" case is lower-hanging fruit, requiring fewer resources for a hybrid quantum-classical approach.\n",
"- This example shows quadratic improvement in comparison to a classical \"brute force\" solver. However, in reality, there are much faster classical solvers. As a basic example, a solver can \"prune\" branches of a search by backtracking if a partial assignment is not satisfiable. Such modifications are, in general, also feasible on a quantum computer. For example, see [[5](#Backtrack)] on Quantum Backtracking."
]
},
{
Expand All @@ -673,15 +671,15 @@
"source": [
"# References\n",
"\n",
"<a name='Whitebox'>[1]</a>: [E. Bounimova, P. Godefroid and D. Molnar, \"Billions and billions of constraints: Whitebox fuzz testing in production,\" 2013 35th International Conference on Software Engineering (ICSE), San Francisco, CA, USA, 2013, pp. 122-131, doi: 10.1109/ICSE.2013.6606558.](https://ieeexplore.ieee.org/document/6606558)\n",
"<a name='Whitebox'>[1]</a> [Bounimova, E., Godefroid, P., and Molnar, D. (2013). Billions and billions of constraints: Whitebox fuzz testing in production, 35th International Conference on Software Engineering (ICSE), San Francisco, CA, USA, 2013, pp. 122-131, doi: 10.1109/ICSE.2013.6606558](https://ieeexplore.ieee.org/document/6606558)\n",
"\n",
"<a name='Gro97'>[2]</a>: [Grover, Lov K. \"Quantum mechanics helps in searching for a needle in a haystack.\" Physical review letters 79.2 (1997): 325.](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.79.325)\n",
"<a name='Gro97'>[2]</a> [Grover, Lov K. (1997). Quantum mechanics helps in searching for a needle in a haystack. Physical Review Letters, 79.2: 325.](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.79.325)\n",
"\n",
"<a name='GroWiki'>[3]</a>: [Grover's algorithm (Wikipedia)](https://en.wikipedia.org/wiki/Grover%27s_algorithm)\n",
"<a name='GroWiki'>[3]</a> [Grover's algorithm (Wikipedia)](https://en.wikipedia.org/wiki/Grover%27s_algorithm).\n",
"\n",
"<a name='FPAA'>[4]</a>: [Yoder, Theodore J. et al. Fixed-point quantum search with an optimal number of queries. Physical review letters 113 21 (2014): 210501](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.113.210501)\n",
"<a name='FPAA'>[4]</a> [Yoder, Theodore J. et al. (2014). Fixed-point quantum search with an optimal number of queries. Physical Review Letters, 113.21: 210501.](https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.113.210501)\n",
"\n",
"<a name='Backtrack'>[5]</a>: [Montanaro, Ashley. (2015). Quantum walk speedup of backtracking algorithms. Theory of Computing. 14. 10.4086/toc.2018.v014a015](https://theoryofcomputing.org/articles/v014a015/)\n"
"<a name='Backtrack'>[5]</a> [Montanaro, Ashley. (2018). Quantum walk speedup of backtracking algorithms. Theory of Computing, 14, doi: 10.4086/toc.2018.v014a015](https://theoryofcomputing.org/articles/v014a015/)\n"
]
}
],