The entire semester has been a very trying one, yet very interesting and fascinating at the same time. Learning about finite state automata, understanding strings as mathematical sets, program correctness, context-free grammar all truly tested my problem solving skills.
Recently, I solved a problem that asked us to Prove that any regular expression that doesn't include a Kleene star denotes a finite language.
Initially, I attempted to solve this problem using induction. This turned out to be an extremely complex approach, instead, we were directed to look at the definition of what makes a language and its regular expression properties.
By studying the notes, we see that any language that is Kleene star, it also has the properties of (R+S) and (RS). So if we can prove the latter of the two, we have satisfied our problem.
Basis: 0, epsilon, and a (for each a in sigma) are in the RE of sigma
Induction step: If R and S are in RE of sigma (ie: R and S are regular
expressions of sigma), then so are (R+S), (RS) and R*.
We know by definition that the Kleene star for any language is the set of all
possible concatenations of 0 or more strings in L. Hence, since (R+S) and (RS)
both are also valid regular expressions of sigma, by proving that those
are the regular expressions of finite languages, our proof is complete.
To prove this, we examine the definitions of what makes a regular
expression for a language L.
Let P(n) represent the regular expressions (R+S) and (RS) for the language L
where L is a finite language.
Let the language be denoted by a regular expression R be called L(R).
Base case: L(0) = 0, L(epsilon)={epsilon}, L(a) = {a}. All of these regular
expressions denote a finite language. Since L(0) only contains 0 strings, which
is finite. L(epsilon) contains the finite string epsilon, and L(a) contains
the finite string a.
Inductive hypothesis: if L(R) and L(S) have been defined, then L(R+S) is finite
and L(RS) is also finite.
Inductive step: We have to prove two cases: that L(R+S) is finite and that L(RS)
is finite.
By definition, the property RS means that for every xy where x and y are in the
set of natural numbers, x is an element of R and y is an element of S. Secondly,
if |R| = k (for some k in the set of natural numbers) and |S| = l (for some l
in the set of natural numbers), then the upper bound for the maximum possible
concatenations of RS is kl.
Similarly, the property of R+S means that for every x that is in R and ever y
that is in S, R+S = x+y. By definition of union, R+S may consist of x, or
consist of y or neither x nor y for every x and y in the set of natural numbers.
Thus, the upper bound for |R+S| is x+y.
Since RS and R+S both have an upper bound, we have shown their maximum
combination of strings possible. Thus, we have demonstrated that L(RS) and
L(R+S) both denote finite languages since they can only contain a maximum
number of strings.
Therefore, any regular expression that does not include a Kleene star denotes
a finite language.
Friday, December 5, 2008
Subscribe to:
Comments (Atom)