Wednesday 1 April 2015

Felina

This was certainly one heck of a semester. This whole school year was filled with personal tragedies, which made fulfilling my academic obligations a challenge- particularly when it came down to the weekly SLOGs and quizzes. On top of all this, I ended up in the middle of a strike. As a result, four quizzes were cancelled along with their accompanying tutorials. Which kinda sucks, since the 10% of the mark allocated to all the quizzes originally scheduled is now being spread over the six or so that actually got carried out. And since I didn't do quite as well on the earlier quizzes as on the later ones, I only stand to lose as a result of the strike. Now, don't confuse my feelings about the impact of the strike with my opinions on the legitimacy of the strike itself. I'm not brave enough to discuss those on a blog read by both instructors and TAs. What I wish to impart, though, is just the simple recognition that this was a messed up semester, for more than a few reasons.


However, with that said, I think I learned a tremendous amount from this course. Now, if I can be honest, I think I learned a lot more from the exercises and assignments than I did from the lectures and tutorials. In fact, I often skipped the latter just due to how pointless they seemed. However, I can't help but think that the course was designed to be that way. Computer science- as the lecture notes made abundantly clear- is really more about computational thinking and problem solving than about understanding one specific program, syntax or algorithm. And in order to solve problems, you need to get used to this notion of actually understanding the problem in the first place. And not just in a higher-level sense, but also in the way a computer would understand the problem- as a set of instructions to be performed sequentially. This requires you to represent the problem in many ways: at higher levels of abstraction, which ignore implementation and focus on the problem itself, and at lower levels, where you work out the nitty gritty. This is a skill, and skills require practice. So in that sense, I am deeply grateful for the opportunities I was given to continuously hone my skills. I can say with great confidence that, despite the hardships I've faced this semester, I came out of it an improved person.

Saturday 28 March 2015

That Post Where I Revisit An Earlier Blog Post, Reflect On How Much I've Grown, And We All Get Sentimental

This is not a SLOG I looked forward to writing. Now, as my massive horde of readers has probably noticed, I am not a socialite. I just don't like to talk about myself very much. If you skim through my many posts, you'll notice a recurring theme- they're mostly concerned with my comprehension of the concepts covered in the class, not so much with my experiences regarding them. And, if I can be even more blunt, I only interact with other SLOGs to get the participation marks. So, this post will, by and large, serve as a way of remedying all this.

Anyways, let's just cut to the chase. It should not come as a surprise that I've learned a lot since starting this course. As new concepts stack upon older ones, the big picture of the course keeps getting clearer and clearer. Sometimes, something I've written in the past reads as though I could've written it now. Other times, it just looks hopelessly outdated. Now, the goal of this blog post is to find just one example of this, and to present it to the world so that the TAs can laugh at me behind my back for not getting it right the first time- I mean, experience that sentimental feeling a parent gets when they watch a child utter their first word, and then shed a tear. I love you all too.

Anyways, to be honest, I have a tendency to revisit past blog posts and make changes. Usually every time I write a new one. Sometimes it's just to correct spelling and grammar mistakes. Other times it's to improve my style and prose. And sometimes, if I run into a sentence which is syntactically broken beyond repair, I'll just rewrite it from scratch. And of course, whenever I revisit these past blog posts, the temptation to update them with the stuff I know in the present becomes stronger and stronger. In some cases, the finished product barely resembles the first draft.

So, long story short, I'm going to do something a little bit different. Rather than just recall a single blog post from the past that I now agree or disagree with, I'm going to do this with a whole series of blog posts. Namely, my series on Abstract Data Types, which covers Trees, Linked Lists, and Binary Search Trees.

In all of these posts, I tried to provide a short and concise explanation of what these abstract data types are. I tried to remain as aware as possible of the subtle distinction between the implementation of them and the sort of abstract definition they all have. And, as I pointed out in a later post, the abstract definition of an ADT is often hard to pinpoint, since it's focused more on the relationships between the parts of the structure than on any one part of it. However, in all of these posts on abstract data types, I feel as though I focused too much on the abstract part, and not enough on the specific details of implementation. In my defense, most of the SLOGs I read seemed to focus disproportionately on the implementation of the ADT, so it seemed appropriate for me to do something different. And what's wrong with a little differentiation?

However, my approach hasn't really afforded me the ability to discuss recursion, and the role it has played in this course. Now, as many of my peers have noted, recursion is a very central element of this course. It's an element of our theoretical toolbox which makes it possible to efficiently solve all sorts of problems we previously assumed were tedious at best. The competence shift I gained when I added this tool to my arsenal was analogous to the jump I made when I first learned how to use iteration in CSC108. It's a little embarrassing, but I remember feeling like I could solve any problem thrown at me. Like there was nothing left to learn. It was wonderful.

Recursion can be thought of as being even more basic than iteration. Now, iteration can be thought of as sequentially traveling across an array from left to right, correct? Well, using that intuition, we can think of recursion as a way of traveling from any one thing to another. It can simply be the next element of an array, as is the case in a Linked List. There can be two next elements in the array, and a function determining which one you will travel to (as is the case with a binary search tree, if you'll pardon my loose use of the words "element" and "array"). Really, in this sense, there is no real direction- there's just the "current thing" of analysis and the "next thing".

Now, consider a program which relies on recursive processes, like a program which collapses a list of elements containing arbitrarily deeply nested sub-lists into a list of depth one. In this program, the method is simple: evaluate every element of the list. If that element itself is a list, reapply this function to it, which will again cause it to iterate over every element of that list. If it's not a list, then just put the element into some new list which is passed along by the function.
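The method just described can be sketched in a few lines of Python (the name flatten and the exact signature are my own choices, not necessarily the course's):

```python
def flatten(lst):
    """Return a depth-one list containing every non-list element of lst,
    however deeply nested."""
    flat = []
    for element in lst:
        if isinstance(element, list):
            # The element is itself a list, so reapply the function to it.
            flat.extend(flatten(element))
        else:
            # Not a list: put the element straight into the new list.
            flat.append(element)
    return flat
```

So flatten([1, [2, [3, 4]], 5]) collapses everything down to [1, 2, 3, 4, 5], no matter how deep the nesting goes.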

Now, although this isn't an instance of iteration over some array like a "for loop", it certainly shares some important similarities with one. We start off with an initial input, and then the function takes it and returns a call to itself, except this time with a different input. You can think of this sequence of inputs and outputs which become new inputs as a sequence which, if the recursive function is defined properly, will end at some point and return the desired final output. So although it is not mere iteration, it is certainly similar, since it is a sequential process- which is extremely reminiscent of the sort of sequential processing computers are known and loved for.

So, now it's probably much easier to see what I mean when I say that the majority of the ADTs we work with are contingent on recursion and recursive sub-processes. I'm just happy I finally had a chance to revisit those old posts in order to make this point. As many of my peers have pointed out, recursion does seem like a major theme of this course. And I'm happy I got a chance to express my thoughts on it.

Friday 20 March 2015

Inserting and Deleting Nodes in a Binary Search Tree

Something I didn't mention in any of my posts on the non-built-in ADTs like Trees, LinkedLists, and BSTs was how much of a pain it can be to remove or insert nodes in these structures. This topic is of interest to me, not because my SLOG marks are on the line or because it's useful for solving homework problems, but because it leads to a deeper understanding and appreciation of the ADT in question.

So, let's first look at an ordinary Tree. As I've said before, a tree is just a bunch of nodes, each of which has two attributes: "data", which is the value it stores and which can be used for identification, and "children", which is some sort of collection of nodes that can be accessed directly from said node. So, based on the implementation of a Tree we did all the way back in the week 6 tutorial, to be the child of a node in that tree is to be an element of the list "children", which is an attribute of that node object. As a result, a parent node can access its children nodes directly. However, the child node cannot point back to the parent, or a cycle would be formed, which would violate the definition of a tree. This tells us something extremely important about the internal structure of trees. In a profound sense, there is no one place where the relation between a child and its parent node is defined. It is a relation which relies on the way the two nodes interact with each other.

Now, when we add or remove a node from a tree, we run the risk of altering delicate relations like these. And in order to avoid doing this, we need to understand what constitutes the relation in the first place, especially with respect to the specific implementation of the ADT we're using. In the case of Trees, we need to make sure that the definition of a tree is preserved (each node except the root has exactly one parent, there's a path from each node back to the root, none of the nodes form a cycle, etc). This isn't fun, but it's not impossible. For instance, to add a node, we just add it as a leaf- that way, we don't need to worry about accidentally creating a cycle. We also need to set its "children" attribute to something which signifies that it is a leaf (such as an empty list). Depending on the implementation, we may also need to update certain attributes dependent on the structure as a whole, such as the branching factor or height. Although linked-lists may appear easier to alter due to their constant branching factor of one, they often contain these types of attributes too, such as length.
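Here's a minimal sketch of leaf insertion, assuming a node class with "data" and "children" attributes like the week 6 tutorial's (the names TreeNode and add_leaf are mine, not the tutorial's):

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.children = []  # an empty list signifies a leaf


def add_leaf(parent, data):
    """Attach a new node as a leaf under parent.

    Because the new node starts with no children and nothing points
    back from it to the parent, no cycle can be created this way."""
    leaf = TreeNode(data)
    parent.children.append(leaf)
    return leaf


root = TreeNode("root")
child = add_leaf(root, "child")
```

Note that the new node only ever appears in its parent's "children" list- the parent-child relation lives entirely in how the two nodes point at each other, just as described above.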

With BSTs, a whole other problem comes up. For binary search to actually pay off, the tree should also be "balanced". In a nutshell, balance is the idea that, for any node in the BST, the height and number of nodes in the sub-tree rooted at its left child are roughly equal to those of the sub-tree rooted at its right child. Balance is very hard to preserve since, for starters, it's very hard to even define to begin with. It's a relationship which isn't stored in any one node, or even a large collection of them, but in every single node at some level of analysis. The reason why we want balance is simple- if we are looking for an element in a BST, we want to always have two options available to us: the ability to go left or right. If any node in a BST lacks, say, a left sub-tree, then when we are traversing the tree, we can only go right (assuming, of course, that the element hasn't already been found and there are still children). In this case, we don't actually reduce our options. Rather than halving the problem space, as we aim to do at each node in a BST, we just go through it sequentially. And although this may not seem like a big deal, just keep in mind that some BSTs contain thousands of nodes. So if your tree has a lot of nodes with only one path instead of two, and those nodes are traversed, then for those stretches it will be as if you are using a simple linear search. And that can be really inefficient.

So, when inserting and deleting nodes from a BST, we must try to preserve this sense of balance. If we remove a node, we want to find a way to replace it with another node directly below it. And as mentioned earlier, when we insert a new node, we want to attach it as a leaf. That way, it doesn't take over the place of an entire sub-tree, or unnecessarily elongate a given path. All of this makes sure the binary search tree retains its ability to do binary search.
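As a sketch of the insert-as-a-leaf idea, here's one hypothetical way it could look in Python (BSTNode and its "left"/"right" attribute names are my assumptions, not the course's actual code):

```python
class BSTNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(node, value):
    """Insert value as a new leaf, preserving the BST ordering rule:
    smaller values to the left, larger (or equal) values to the right."""
    if node is None:
        return BSTNode(value)  # found an empty spot: attach the leaf here
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node


root = None
for v in [7, 3, 9, 5]:
    root = insert(root, v)
```

Because insertion always happens at an empty spot at the bottom, no existing sub-tree is displaced- though, as noted above, a long run of such insertions can still unbalance the tree.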

Wednesday 18 March 2015

Abstract Data Type: Binary Search Tree

I said I would cover a new type of Tree in my post about Tree ADT's, didn't I?

Anyways, today I'm focusing on Binary Search Trees, also called BSTs. Now, to understand BSTs, let's first make one thing clear- they are NOT the same thing as Binary Trees. I made this mistake a lot when I first started working with them, and it made communicating my ideas about these concepts quite annoying. So in case I wasn't clear- let me reiterate this point one more time:
BT   =/=   BST

A Binary Tree- also called a BT- is simply a tree ADT in which every node has 2 positions for possible children: left and right. This means that, from any node in a BT, you can directly access a maximum of two other nodes. Although this may seem restrictive at first, it also allows us to implement various types of search algorithms. And this is where BSTs come in.

A BST is just a type of BT with an additional rule: the value of any node in the tree must be larger than the value of every node in the sub-tree whose root is its left child, and smaller than the value of every node in the sub-tree whose root is its right child. Confusing? Here's a better way of explaining it.

Remember, the topmost node of a tree (the one with no parent) is the root. Any node in a tree can be considered the root of its own tree- called a sub-tree of the original tree. In order to access a specific node in some tree- let's just say its value is 7- we need to use a search algorithm, and we must always start at the root of the tree provided. Then, we must find some way to travel (or "traverse") the nodes of the tree until we find the node we want. Unlike with a list, we do not simply travel over every node in sequence until we find the one with the value we want. In fact, such a searching method is considered really inefficient. After all, you would have to evaluate every element just to verify that the element isn't in the tree at all.

However, there are more efficient search algorithms, such as binary search. Binary search takes a list already sorted in ascending order, and looks at its middle element. It then splits the list into two sub-lists: one containing every element to the left of the middle, and one for every element to the right. It then does a comparison. If the element you're looking for is bigger than the middle, go to the right sub-list and repeat the algorithm. If it's smaller, go to the left sub-list and repeat. And if it's not bigger or smaller, then congrats- it can only be equal, and so you've found it!
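Here's a rough Python sketch of that description (I've used slicing because it mirrors the "split into two sub-lists" wording, even though it's not the most efficient way to do it):

```python
def binary_search(sorted_list, target):
    """Return True iff target is in sorted_list, halving the
    search space at each step."""
    if not sorted_list:
        return False  # nothing left to search
    mid = len(sorted_list) // 2
    if target > sorted_list[mid]:
        # Bigger than the middle: repeat on the right sub-list.
        return binary_search(sorted_list[mid + 1:], target)
    elif target < sorted_list[mid]:
        # Smaller than the middle: repeat on the left sub-list.
        return binary_search(sorted_list[:mid], target)
    else:
        return True  # not bigger or smaller, so it must be equal
```

Each call throws away half of the remaining list, which is exactly why this beats checking every element one by one.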

As you have probably guessed, binary search trees are another way of implementing this search algorithm. However, rather than taking a presorted list, it takes a BST. Since a BST, by definition, stores elements in such a manner that the only elements in the left sub-tree of a node have values smaller than its own (and bigger for the right), a similar algorithm can be implemented to find any piece of data. For instance, if the node you are looking for has a value of 5, and the value of the node you are evaluating is 7, you know that the node you are looking for must occur in the sub-tree to the left of the current node (if it's in the tree at all), since that sub-tree contains all the values smaller than 7. So, in a reasonably balanced tree, this search algorithm will be significantly faster than a simple, linear search.
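A sketch of that search, using a hypothetical BSTNode class (the attribute names "value", "left" and "right" are my assumptions, not necessarily the course's):

```python
class BSTNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def search(node, value):
    """Return True iff value occurs in the BST rooted at node."""
    if node is None:
        return False  # fell off the tree: the value isn't here at all
    if value < node.value:
        return search(node.left, value)   # smaller values live on the left
    elif value > node.value:
        return search(node.right, value)  # bigger values live on the right
    else:
        return True


# The example from above: looking for 5 starting at a node valued 7
# sends us left, where the smaller values are.
tree = BSTNode(7, BSTNode(5), BSTNode(9))
```

Here search(tree, 5) goes left exactly once and succeeds, without ever looking at the 9.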

Thursday 12 March 2015

Abstract Data Type: Richard Linklist-er

In week 8, the topic we are forced- sorry, required- to write about is an ADT called a "linked-list". Which sounds an awful lot like the surname of the director Richard Linklater. And since I just watched "Dazed and Confused", I wanted to reference this admittedly shallow relation between the two words. Alright, now to move on. In this post, I will try to explain, to the best of my abilities, what a linked list is.

A linked-list is, to be succinct, a collection of individual node objects, each with two attributes: one called "data", which is the actual piece of data being stored in the linked list at that position, and one called "nxt", which stores the next node in the sequence (or None if it's the last node of the linked list). The first node is stored in an overarching data structure called a "LinkedList". In short, the linked-list shell lets you find the first node, and each node (including the first) lets you access the very next one (or tells you it's the last).
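A minimal sketch of these two classes, using the "data", "nxt" and LinkedList names from the description above (the "first" attribute name is my guess):

```python
class Node:
    def __init__(self, data, nxt=None):
        self.data = data
        self.nxt = nxt  # the next node in the sequence, or None at the end


class LinkedList:
    def __init__(self, first=None):
        self.first = first  # the overarching shell stores the first node


# A three-node linked list: 1 -> 2 -> 3
lst = LinkedList(Node(1, Node(2, Node(3))))
```

Everything else a linked list can do falls out of these two tiny classes, as we'll see below.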

Now, on the surface, linked-lists may seem trivial- especially when compared to tree data structures. After all, the whole premise of a tree is that each node has multiple children nodes. So if you are at any node in the tree, you can directly access a large set of nodes without having to go through them all sequentially. This is why we often use them over lists when storing comparable types of data (a topic I will cover in a later post). For any node in a linked-list, by contrast, we can only access the one element directly after it. In this sense, it seems nearly indistinguishable from a tree with a branching factor of one, which kinda defeats the point (as we'll see in a later post). And from my correspondence with my peers, this sentiment isn't uncommon. And speaking of correspondence with my peers, here's a link to all my interactions with other SLOGs. Just throwing that out there, if you just so happen to be a marking TA...

Anyways, how can I convince all of them that linked lists are worth learning about? Well, let's think of them in a different way. In much the same way we learn the fundamentals of programming using a specific language (in this case Python), we can also learn the fundamentals of many of our built-in data types by understanding analogous, non-built-in ones. In this case, I think a case can be made that understanding linked-lists gives us a better appreciation for what is perhaps my favorite built-in data type- the list.

List is an extremely useful data type for solving problems. You can put various types of objects in, and, like magic, they don't disappear into the abyss of computer memory. You can also identify objects that you have previously put into the list (called elements), so that you can do things with them later. You can immediately identify the first and last elements of the list. You can also identify the element in the list at a certain "position", called an index. Once you have identified the element you desire, you can use it just like any ordinary object in the shell. For example, if it's an int, you can apply the arithmetic operators over it. Perhaps the most important feature of a list is the ability to evaluate every single element in the collection sequentially, from one end to the other. This process, called iteration over the list, is very commonly used in the implementation of various types of algorithms- such as searching and sorting. It's hard to imagine what programming would be like without this ability.

Interestingly, a linked list is capable of doing these exact things. If you don't believe me, then go back and re-read my definition of a linked-list. Recall how, for any node in a linked list, we can immediately access the node after it- identified by the attribute "nxt". We can also identify the beginning of the linked list immediately, since that piece of data is stored in the linked list object itself. And, although not immediately obvious, we can identify the very last node of the sequence by simply finding the node whose "nxt" attribute is set to None. Because we can identify the first and last elements of a linked list, as well as the next element from any given position, we are capable of iterating over a linked list, just like a regular list. In fact, with little effort, one can even write an algorithm which identifies elements in a linked-list by their index number: identify the first node, access its "nxt" attribute, add one to some counter, and repeat the process for the next node until the value of the counter is equal to the index inputted, then return the value of that node. This process perfectly simulates the concept of indexing in a list (at least for index values greater than or equal to zero).
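That indexing algorithm can be sketched like this, assuming the Node and LinkedList shapes described earlier in this post (get_item is my name for it):

```python
class Node:
    def __init__(self, data, nxt=None):
        self.data = data
        self.nxt = nxt


class LinkedList:
    def __init__(self, first=None):
        self.first = first


def get_item(linked, index):
    """Return the data at position index (index >= 0),
    simulating list indexing."""
    node = linked.first
    counter = 0
    while counter < index:
        node = node.nxt  # step to the next node...
        counter += 1     # ...and add one to the counter
    return node.data


lst = LinkedList(Node("a", Node("b", Node("c"))))
```

So get_item(lst, 0) gives back "a", just as lst[0] would for a regular list- the counter does the work that an index does behind the scenes.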

Rightly or wrongly, I think that linked-lists are far more relevant to computer science than many of my peers appear to believe. Now, this is not to say that linked lists have no pragmatic value at all outside of being an educational aid. Perhaps linked-lists really are more efficient at storing certain pieces of data, such as the length of the linked-list. But the point I wish to make is that they are useful for another reason: they take a relatively complicated process and break it down into much more palatable pieces. And that's something a computer science course can never have too much of.

Saturday 28 February 2015

Summary of Object Oriented Programming Part 2

Since I already made a post on Object Oriented Programming concepts, and since the course syllabus is asking me to write a second, here is a follow up post covering a few more concepts I didn't get to cover the first time around.

Anyways, in my last post, I think I covered "what" object oriented programming is. In this one, I want to cover the "why". And the answer isn't at all obvious. I mean, many of the programs we've written thus far could have simply been written functionally, without a single class definition. However, to understand the value of objects, we will first need to understand Abstract Data Types.

VI) Abstract Data Types:

Abstract Data Types are simply types of data. A type of data can be anything- such as a sequence of alphabetical characters spelling "I_love_you", or the number 4, or even a combination, like "no_i love_you_5ever". Data can also be something like the blueprints to a computer. Don't think too hard about the "data" part- think about the "abstract" part instead.

A big part of being abstract is to not just be a mere "instantiation" of something, but to be, in a way, connected to many types of things. Think back to our discussions on classes. A class, such as class "int", seems to be abstract, in that it isn't tied to any particular instance of the class (like the integer 4). Now, when I say "abstract" I do not mean "vague". Although both words seem to refer to properties not tied to individual instances of things, they are very different. Something which is abstract is, to some extent, well defined. For instance, a square. A square has 4 sides of equal length. This is true for all squares, and you would likely label the vast majority of things you see with four equal sides a "square". When something is vague, in contrast, it is not well defined. In fact, this is why it appears to be applicable to so many different kinds of problems, or able to be interpreted in so many different ways.

The key feature of object oriented programming, I think, is that it tries to take these sorts of abstract data types and implement them in an intuitive way. It gives programmers the tools needed for this herculean task- such as the ability to create classes, and to give special properties to any instance of a class. In this way, it affords programmers a plethora of ways to solve whichever problem they are trying to tackle. Their theoretical toolbox is expanded to include nearly anything from their imagination.

Abstract Data Type: Tree

In week seven, we started working with a new abstract data type- trees. It was a notable experience- not just because of how useful they are, but because I worked with them in my "intro to syntactic patterns" course last semester. Oh, the nostalgia.

Anyways, if my understanding of trees is correct, they are simply a way of hierarchically storing data. There are relatively few parameters to worry about- it really just boils down to how "high" or "low" in the tree a piece of data is relative to another. There are nodes- the elements in the tree which store the data- and there are branches- the lines which connect them and determine their relative positions. One node must be the root which, contrary to its name, is actually at the very top of the tree. In other words, there does not exist a node in the tree such that the root node would be a child or descendant of it (more on the definition of "child" later). Likewise, there are also leaves, which occur at the very bottom of the tree. The hierarchy of the tree is preserved by a general rule: for any given node (except the root), there must be exactly one node which is higher on the tree and connected to it by a branch. This superordinate node is referred to as the "parent", while the node itself is the "child". A node can have as many children as you desire- but each can have only one parent. In the end, this translates to a triangular shape in which one node branches off downwards to a few more, then each of those branches down to a few more, until you get to the bottom.
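One possible way to implement that description, using the "data" and "children" attribute names this blog has been using (a sketch, not the course's actual code):

```python
class TreeNode:
    def __init__(self, data, children=None):
        self.data = data
        # Any number of children allowed, but each node is stored in
        # exactly one parent's list- so each child has only one parent.
        self.children = children if children is not None else []


# A tiny tree: the root branches downwards to two children,
# one of which branches down to a leaf of its own.
root = TreeNode("root", [
    TreeNode("left", [TreeNode("leaf")]),
    TreeNode("right"),
])
```

Notice how the "one parent per node" rule isn't enforced anywhere explicitly- it just falls out of each node appearing in only one "children" list, which foreshadows a point I make in a later post.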

An example of a tree data type, pulled right from linguistics (and inspired by my first post), would be a syntactic tree. Syntactic trees are a method of modeling the way our minds organize words and the concepts behind them into what we would deem "grammatical" sentences. It's the reason I can't spit just word soup out expect and you follow to it. Long story short, words have broad categories, and they are organized in a particular order with respect to those categories. Here is an image of a syntactic tree:



As you can see, different types of words seem to appear in different positions in the tree, based on certain relevant features. For instance, the phrase "is[cop] a mathematics machine" sits to the right of the node, "the computer". It appears that, whenever a property is predicated onto a subject, the phrase containing that property must occur to the right of it. Their position relative to each other on the sentence node "S" signifies which of the phrases is the subject and which is the predicate (at least, for simple active sentences).

I hope I have demonstrated a strong intuitive grasp of the idea behind tree structures, and their many applications. I'll write more about them later, but for now, I need sleep.


Syntactic Tree borrowed from article: Using nltk_lite’s chunk parser to detect prosodic phrase boundaries in the Aix-MARSEC corpus of spoken English. Although the article is focused on natural language processing and not linguistic theory per se, it still illustrates what a syntactic tree looks like.

Summary of Object Oriented Programming concepts ORIGINAL

I've edited and re-edited my "summary of object oriented programming concepts" blog post so many times, it's practically an entirely new blog post. So, I thought it would be an interesting idea to publish the original version. I hope you enjoy it!
_____________________________________________________________________________

 


I) Objects:

Perhaps the most important concept of object oriented programming is the titular objects themselves. Most Python textbooks or websites will define an object as a representation of a value, with a memory address that points to that value somewhere in the magical world of computer memory. For instance, the number 4 is an object. The memory address of the value 4 is the same regardless of how we point to that object, so any variable that refers to the object 4 will be considered identical to any other (since they all point to the same value in computer memory, regardless of how that 4 was obtained- at least for small integers like 4, which Python keeps a single copy of).

>>> s = 2+2
>>> t = 4
>>> s is t
True

Note that the value of an object does not always pin down its identity. Mutable types, such as list objects, are a prime example. Even if we assign two variables the same value- an empty list- they will not be considered identical:

>>> t = []
>>> s = []
>>> s is t
False

Although it's not obvious at first, the reason this happens is that the empty list s refers to is actually a different empty list than the one t refers to. If something is added to one, it will not be added to the other. However, they are still considered equivalent:


>>> s == t
True
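To make the difference concrete, here's a small sketch of what happens once one of the two equivalent lists is mutated:

```python
s = []
t = []
s.append(1)  # modify s only; t is a different list in memory

assert s == [1]      # s changed
assert t == []       # t did not
assert s != t        # no longer equivalent
assert s is not t    # and they were never identical
```

So equivalence (==) compares values, while identity (is) compares which object in memory you're holding.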

As much as I love rambling about objects, I need to cover a few other concepts first. These concepts will incorporate and extend the concept of objects.


II) Classes:

A class is usually defined as a blueprint for building an object. Now, remember how we defined objects as values with associated memory addresses? Well, let's expand on that a little. Let us say that we are using a programming language which, somehow, has no classes. In this language, we could type in a sequence of alphabetical characters which just so happens to spell out the phrase "brown_nosing". Now what? Well, we could store it in a variable- but that's not very interesting. We could try to count every character in the sequence, or to return a copy of it in all caps. Perhaps we want to add the phrase "I_am_a" to the front of it. But how do we do those things? And worse yet, what if we wanted to do these things to the phrase "you_are_an_amazing_person"? We would have to do a lot of retyping, wouldn't we? However, what if we could create an infinite number of sequences of alphabetical characters, which all have the same sorts of properties? Well, that's what classes are for. With a class like, oh I dunno, strings, we can create individual "string" instances, which all share the same sorts of properties. For instance, whenever you put two strings together with a ' + ' in between, the two strings are combined into one. This works with all strings, so now we can add the phrase "I_am_a" to "brown_noser" or to "fan_of_you", without needlessly retyping things.

So, to be succinct, a class defines what all its members can do. And now to transition to a few more OOP concepts.


III) Properties of Class members

Now, what do I mean when I say that class members have properties? Well, think about how things in the outside world have properties. Consider an elephant. An elephant has things it can DO- such as blow water out of its nose/snout thing. It also has things that represent what it IS- for instance, it has 4 legs and 1 long nose tubey-thing. Now, suppose we want to create a bunch of virtual elephants in our computer (because hey, Watson wrote a cook book, so why not?). If we make each elephant an object which is a member of a class, then we need some way for the class to determine what each elephant instance can do and what properties it has. These are the roles of methods and instance variables, respectively. Methods can be thought of as "instance functions", since they are functions that exist inside of every object which is a member of the class the method was defined in. Variables storing properties come in two flavours: class variables, which every member of the class shares with the exact same initial value, and instance variables, which are initialized- that is, set- as soon as an instance is created. How you choose to use both types of variables will depend on the type of problem you are trying to solve.
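Since we're already imagining virtual elephants, here's a tiny sketch showing a method, a class variable, and an instance variable side by side (all the names are made up for illustration):

```python
class Elephant:
    legs = 4  # class variable: shared by every elephant, same initial value

    def __init__(self, name):
        self.name = name  # instance variable: set when the instance is created

    def blow_water(self):
        # method: something every member of the class can DO
        return self.name + " blows water out of its nose/snout thing"


dumbo = Elephant("Dumbo")
```

Here dumbo.legs is 4, a value it shares with every other elephant, while dumbo.name belongs to that one elephant alone.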


IV) Inheritance of properties

In much the same way that an object is an instance of a class- a class can also be an instance of another class. When this happens, some of the properties of the superordinate class can be shared by the subordinate "children" classes. This is inheritance.


V) Implications of the Instantiations of Classes:

Now, this is the last thing I want to cover. There are very technical terms I could also cover, such as the concept of property and the distinction between extending and overriding, but, in my opinion, they are really just variations of the same abstract concept of instantiation. When you instantiate, you create an object which is an instance of the class. This part should be clear by now. But think about the implications for a moment. This means that often, there is a one way relationship between class and the instance of the class. Although if may be the case that, if the object is an instance of class A, it gains such and such properties- having such and such properties isn't proof that all members of the class have those properties. In fact, often, we want to add to the properties of an object after it has been instantiated. The object does not stay the way it was initialized- in fact the reason it was initialized the way it was was to make modifying it easier.

The same thing applies to subclasses. In fact, you can completely override the properties of a subclass so that it functions completely differently than other subclasses of the original class (or the original class itself). Similarly, you can merely add properties to it- which is extending. In the end, whichever one of these techniques you utilize will depend on the problem at hand. In the workforce (from what I've been told), it is considered bad form to just completely change you code if other programs are working with it. In these cases, you would want to utilize extension, since the prior code would continue to run as intended, while newer code could utilize any of the newly added functionality.

Anyways, I hope that this was a suitable summary of some object oriented programming concepts. I wanted to cover what, in my opinion, was most fundamental to understanding the OOP philosophy, without going over too much. Well, that and I've been typing this for an hour, so now I need to sleep.

Sunday 8 February 2015

Summary of Object Oriented Programming concepts

I couldn't think of a better name. Anyways, here is a nifty little overview of some OOP concepts we've covered thus far in CSC148.


I) Objects:

Perhaps the most important concept of object oriented programming is the titular objects themselves. Most python textbooks or websites will define an object as a representation of a value, with a memory address that points to that value somewhere in the magical world of computer memory. For instance, the number 4 is an object. Any variable that refers to 4 points to the same value in computer memory, regardless of how that 4 was obtained, so all such variables are considered identical. (To be fair, this sharing is something CPython happens to do for small integers rather than a guarantee of the language, but it works nicely for our example.)

>>> s = 2+2
>>> t = 4
>>> s is t
True

Note that variables with equal values do not always share the same object. Mutable types, such as list objects, are a prime example. Even if we assign 2 variables the same value- an empty list- they will not be considered identical:

>>> t = []
>>> s = []
>>> s is t
False

Although it's not obvious at first, this happens because the empty list s refers to is actually a different empty list than the one t refers to. If something is added to one, it will not be added to the other. However, they are still considered equivalent:


>>> s == t
True
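A quick sketch (my own example, not from the course notes) of what that separation means in practice:

```python
# s and t are equal in value but are two separate list objects in memory.
s = []
t = []

s.append("hello")   # mutate only the list that s refers to

print(s)        # ['hello']
print(t)        # [] -- t is untouched
print(s is t)   # False -- they never shared a memory address
```

Once one of them is mutated, even their equivalence disappears, since their values no longer match.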

In order to better explain the philosophy behind objects, I must introduce a few related concepts first.


II) Classes:

A class is usually defined as a blueprint for building an object. Now, remember how we defined objects as values with associated memory addresses? Well, let's expand on that a little. Suppose we are using a programming language that has no classes. In this language, we could type in a sequence of alphabetical characters which just so happens to spell out the phrase "brown_nosing". Now what? Well, logically speaking, a human can do many things with it- like count every character in the sequence, or create a copy of it in all caps. Or perhaps add the phrase "I_am_a_" to the front of it. We could simply create a new sequence of characters every single time- but that would be time consuming, wouldn't it? Plus, to make matters worse, we would have to build other, unrelated sequences of characters the same tedious way, such as "you_are_an_amazing_person". Now, imagine if we could take an already existing sequence of characters and manipulate it directly. Or better yet, what if every sequence of characters we create- regardless of which specific characters are used- could be manipulated in the same way? Well, that's what classes are for. With a class like- I dunno, strings- we can create "string" objects, which are instances of the string class. All strings share the same sorts of basic properties, and can be manipulated in the same general way. For instance, whenever you put a "+" between two strings, they will be combined, or "concatenated", into a new one. This works with all instances of string objects, regardless of which specific characters appear in them, so we can add the phrase "I_am_a_" to the front of "Brown_noser" just as easily as we could to "fan_of_you".
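Here's the string example made concrete (the toy phrases are my own, obviously):

```python
# These behaviours come from the str class, so they work on ANY string,
# no matter which characters it happens to contain.
prefix = "I_am_a_"
print(prefix + "Brown_noser")   # I_am_a_Brown_noser
print(prefix + "fan_of_you")    # I_am_a_fan_of_you

print(len("brown_nosing"))      # 12 -- count every character
print("brown_nosing".upper())   # BROWN_NOSING -- a copy in all caps
```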

In our hypothetical language without classes, there are no "defining features" to any given "thing" in a program. Every sequence of characters is considered a unique thing. However, we have an intuition in the world that certain things can be grouped together. For instance, when I visit my parents, I have this intuition that the little furry mammalian creature that jumps on me is, in some important sense, the same as the other little furry mammalian things I see in other people's homes. I know that they can be grouped together in some meaningful sense, and that doing so can have great pragmatic value (such as making sure none of them consume chocolate). In my view, classes try to capture this intuition we have about the world. They allow us to group similar things together in a meaningful way, which can greatly improve the efficiency and organization of our programs.

And now, here's an uncreative transition to a few more OOP concepts.


III) Properties of Class members

Now, what do I mean when I say that all members of a class share properties? Well, think, on an intuitive level, about what kinds of properties things in the outside world share. Consider an elephant. An elephant has things it can DO- such as blow water out of its nose/snout thing. It also has things that represent what it IS- for instance, it has four legs and a long, tube-like nose thing. Now, suppose we want to somehow create a bunch of virtual elephants in our computer (because hey, Watson wrote a cook book, so why not?). If we make each elephant object an instance of some broad "elephant class", then this class could determine some of the attributes and abilities of each elephant instance. These are the roles of instance variables and methods, respectively. Methods can be thought of as "instance functions", since they are functions that exist inside of every object which is a member of the class the method was defined in. Instance variables work the same way, except that they are variables which store values. Also of note is that these properties can exist at either the class level (in which every member of the class shares the exact same property initially), or at the instance level (in which the property is set when the instance is created, and usually varies between instances). Which types of properties you use, and at which level you use each of them, will depend on the type of problem you are trying to solve.
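Here's a rough sketch of how that elephant class might look in python- every name and number here is my own invention:

```python
class Elephant:
    """A hypothetical elephant class (my own toy example)."""

    legs = 4  # class-level property: shared by every elephant initially

    def __init__(self, name, trunk_length):
        # instance-level properties: set when the instance is created,
        # and usually different for each elephant
        self.name = name
        self.trunk_length = trunk_length

    def spray_water(self):
        # a method: something every member of the class can DO
        return self.name + " sprays water out of its trunk!"

babar = Elephant("Babar", 1.5)
dumbo = Elephant("Dumbo", 0.9)
print(babar.spray_water())       # Babar sprays water out of its trunk!
print(babar.legs == dumbo.legs)  # True -- the class-level property is shared
print(dumbo.trunk_length)        # 0.9 -- instance-level, unique to dumbo
```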


IV) Inheritance of properties

In much the same way that an object is an instance of a class, a class can also be derived from another class. When this happens, some of the properties of the superordinate "parent" class are shared by the subordinate "children" classes. This is inheritance.
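A minimal sketch of inheritance in python (Animal and Dog are just my own toy classes):

```python
class Animal:
    """The superordinate "parent" class."""

    def __init__(self, name):
        self.name = name

    def eat(self):
        return self.name + " is eating."

class Dog(Animal):
    # A subordinate "child" class. It defines nothing of its own,
    # yet inherits __init__ and eat from Animal.
    pass

d = Dog("Rex")
print(d.eat())                 # Rex is eating.
print(isinstance(d, Animal))   # True -- a Dog counts as an Animal too
```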


V) Implications of the Instantiations of Classes:

Now, this is the last thing I want to cover. There are a few more technical terms I could also cover, such as the concept of property and the distinction between extending and overriding, but, in my opinion, they are really just variations of the same abstract concept of instantiation. When you instantiate, you create an object which is an instance of the class. This part should be clear by now. But think about the implications for a moment. This means that there is often a one-way causal relationship between a class and an instance of the class. For instance, if all instances of class A initially receive property B, the fact that an object has property B doesn't necessitate that it is a member of class A.
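One way to see this one-way relationship: objects can gain properties after they're created, so what an object has now doesn't tell you what its class handed it at birth. A quick sketch (Backpack is a made-up class of mine):

```python
class Backpack:
    """A made-up class -- not from the course notes."""

    def __init__(self):
        # initialized empty precisely so it's easy to modify later
        self.contents = []

b = Backpack()
b.contents.append("pencil")   # the object doesn't stay the way it was initialized
b.owner = "me"                # python even lets us attach brand-new attributes

print(b.contents)   # ['pencil']
print(b.owner)      # me
```

So the fact that b now has an owner attribute says nothing about the Backpack class itself- the causality only runs from class to instance, not back.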

The same thing applies to subclasses. In fact, you can completely override the properties of a subclass so that it functions completely differently from other subclasses of the original class (or from the original class itself). So if every subclass of class A inherits property B, we can't even be sure that a given subclass of A will still have B. This is called "overriding". Similarly, you can add new properties to a subclass- which is called "extending" the class. In this case, if a subclass of A has property C, we can't be sure that the original class A or any other subclass will also have C. Again, the direction of causality is very limited. In the end, which of these techniques you utilize will depend on the problem at hand. In the workforce (from what I've been told), it's considered bad form to completely change your code if other programs are working with it. In these cases, you would want to utilize extension, since the prior code would continue to run as intended, while newer code could utilize the newly added functionality.
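To illustrate the difference, here's a little sketch using the stack idea from the course notes (the class names and methods are my own invention, not the course implementation):

```python
class Stack:
    """A bare-bones stack: add to the top, items stored in a list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

class LoudStack(Stack):
    def push(self, item):
        # EXTENDING: keep the inherited behaviour, then add to it,
        # so any older code that uses push still works as intended.
        Stack.push(self, item)
        print("pushed:", item)

class BrokenStack(Stack):
    def push(self, item):
        # OVERRIDING: replace the behaviour entirely -- older client
        # code expecting items to be stored would now misbehave.
        pass

s = LoudStack()
s.push(42)
print(s._items)    # [42]

b = BrokenStack()
b.push(42)
print(b._items)    # [] -- the override threw the item away
```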

Anyways, I hope that this was a suitable summary of some object oriented programming concepts. I wanted to cover what, in my opinion, was most fundamental to understanding the OOP philosophy, without going over too much. Well, that and I've been typing this for an hour, so now I need to sleep.


EDIT this is one of the 3 blog posts that is intended to be graded.

Thursday 5 February 2015

Tracing Recursion: Thinking like a computer.

I like recursion. I don't know why it appeals to me- I just do. In fact, I used recursion in order to code the RPN calculator for the second lab. Without giving away too much, I made a method which would evaluate both operands and the operator of a binomial (written in RPN, of course). If the latter operand was itself a binomial, it would call itself to further break down the binomial into two operands and an operator. And then it keeps doing this until the most deeply nested binomial is broken down, and then evaluates all of them in that order. If I get permission from a TA, I'll post the code for it later on.

Anyways, to get back on topic, I find that tracing recursive code isn't all that different from tracing any other code- provided, of course, that you are used to the concept of recursion. All you need to do is constrain your thinking and just mindlessly follow the steps, without thinking about what the code is actually trying to accomplish. After all, this is what python does. It doesn't know what your program is trying to accomplish- even if it has a name like "count_list_depth". All it knows is that it has to follow the steps, no matter where they go. In a strange way, this is exactly how not to begin writing a program. Usually, if a programmer is needed, the problem isn't so simple that a small sequence of instructions can solve it. Usually, a sort of outline of the problem has to be devised before any sort of solution can be attempted. Long story short- we need to be thinking at a high level of abstraction. It's not enough to know how to solve one or two parts of the problem- we need a skeleton which can account for everything. Only then can the details pertaining to implementation be sorted out.
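Here's a sketch of the sort of function I mean- my own guess at a count_list_depth, not the official one- which you can trace step by step just like python would:

```python
def count_list_depth(obj):
    """Return how deeply nested obj is: 0 for a non-list,
    1 for a flat list, and so on. (My own hedged sketch.)"""
    if not isinstance(obj, list):
        return 0
    # a list comprehension inside recursion -- worth getting used to!
    return 1 + max([count_list_depth(x) for x in obj] + [0])

# Mindlessly following the steps, the way python would:
#   count_list_depth([1, [2, [3]]])
#   -> 1 + max(depth(1), depth([2, [3]]))
#   -> 1 + max(0, 1 + max(depth(2), depth([3])))
#   -> 1 + max(0, 1 + max(0, 1)) -> 3
print(count_list_depth(5))              # 0
print(count_list_depth([1, 2]))         # 1
print(count_list_depth([1, [2, [3]]]))  # 3
```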


Anyways, I found the recursion tracing exercises to be surprisingly pleasant, and an invaluable source of practice. If I could give anyone advice though, it would be to get used to the syntax of recursive code- especially when it comes to list comprehensions. Worst comes to worst, you'll have learned a new and more compact way to build list objects...

Monday 2 February 2015

CSC148 Term Test 1: To Do List

Last night I had this wonderful idea. The first term test is coming up soon, so why not blog a bit about it? Anyways, for your convenience, I've made a long and messy list of the concepts we'll be covering on this term test. I've also marked the concepts I still need to cover, so I can use it as a study guide during these next few... well... hours...


COURSE NOTES CHAPTER 1:

-Objects (information and behavior)
-ADT
-Implementation vs. Semantics
-Stacks (stores items, remove top, add to top, check if empty)
-expose interface
-interface
-clients
-docstring
-inheritance
-extension
-concretizing ADT
-classes
-operations and attributes of classes
-class instance
-initializing an instance
-Magic methods
-__eq__ (equivalence NOT identity)
-__init__
-equivalence vs identity
-__repr__
-__str__ (is set to __repr__ as default)
-Exception class
-privacy and property
-inheritance
-composition vs non composition
-subclass
-override vs extend
-inheritance (allows to override or extend)
-self
-__add__
-class exception
-raise
-try/except


LECTURE MATERIAL
-object oriented design
-help()
-client code
-interface
-public interface
-public
-magic methods __init__, __str__, __eq__
-overriding
-class object
-__repr__
-property
-eval()
-ADT
-stack ADT
-uses of stacks
-doctest
-parent class
-child class
-canonical use of inheritance
-extend vs override
-list comprehensions
-merge sort
-recursion
-recursive function
-tracing recursion
-measure of problem size (list depth, etc)
-formally thinking recursively

TUTORIAL MATERIALS
-object oriented design
-recipe for designing functions, methods, classes, etc
-tracing recursion
-writing recursive code
-pep8
-reading/writing to files


...Yes, I still need to cover pep8 and the CSC108 design recipe. I know how to use it to design functions, but I'm still not sure exactly how I should design methods, classes, etc, using those core principles. Should I worry about corner cases as much? Should I include anything in the class doc-string other than a core description of what the class represents and what public attributes are contained within it? Should it also list the public behaviors (methods) in the class? And speaking of privacy, I still can't think of any non-redundant, non-gimmicky ways to use it. At this stage, I feel that it just makes our code unnecessarily bloated and thus harder to read.

Well, that's it for now. Time to study.

Friday 30 January 2015

Thoughts about the first few weeks of this course.

The first few weeks of this course have certainly been demanding. Unlike most intro to computer science classes, this one has weekly quizzes and requires students to fill in a weekly blog (which, in case you're wondering, is why I wrote this post. I love you, my dear reader, but not enough to do this without academic incentive). And while no one blog entry or quiz totally does me in, they do chip away at my stamina slowly.

Right now, my focus is on the first term test. The quizzes have given me at least some idea of what to expect, so I know what my strengths are (probably recursion) and what my weaknesses are (which I'm not sharing with you, although I'll give you a hint: it rhymes with "gunction design recipe").

Preparing for a CS midterm isn't like preparing for any other. You need to master the truly backwards art of programming without making stupid mistakes. And let's face it- which CS teacher hasn't made a minor programming mistake during a lecture at least once? I need to get used to writing pseudo-code which would actually run if inputted into python. And since I've spent the majority of this week working on CSC165, I now gotta make sure I get enough practice in for this course.

That's all for today. I'll try to do another blog post before the first midterm.

Thursday 22 January 2015

Why Geeks Should Write

(the name wasn't my decision. Personally, I prefer the word "nerd", since I find it has more positive connotations...)


Greetings blogosphere. I am a computer science major at the University of Toronto, with an interest in Linguistics and Cognitive Science. As a part of my introduction to computer science course, I was required to start a blog. It's kinda like being sentenced to community service hours without having done anything wrong. On one hand, you know that it could potentially benefit many people (including the one writing). But on the other hand, there are far better things to do on a Thursday evening than write a blog post only a few people will ever read- one of which is only reading this because they are being forced to (and I thought my life sucked).

Anyways, I can't think of a better way to kick start a computer science blog than to go off on a tangent about how important writing skills are in CS. But first, I would like to address a more general question- what does it really mean to "improve" your writing skills? There are many prescriptive notions of what it means to write well. For instance, if you and me were to get into a discussion about good communication, you might be tempted to dock me some marks for "style", since I used the objective personal pronoun "me" in the subject position. But you wouldn't do a mean thing like that, would you?

In linguistics, one of the most important concepts is mutual intelligibility- the idea that there is a scale on which two people actually "understand" what each other are saying during a given speech instance. Basically, it can go from perfect clarity (where there is essentially a one-to-one mapping between the concepts in the mind of the speaker and the listener due to the magic of language), to no clarity at all- such as if anyone were to speak to me in a language other than English. In between these extremes, which is where we usually find ourselves, is ambiguity- where we aren't 100% sure if the message we got was what the speaker intended. Now, what does this have to do with computer science? Well, everything! There's a reason we call python a programming LANGUAGE after all. What we do when we program is very analogous to what we do cognitively when we try to communicate with a human being. The only difference, of course, is that computers lack our threshold for ambiguity. See, while humns ar rlatvly god at ndrstndng rlly ambgous sntnces, computers aren't. They lack much of the cognitive architecture humans seem to possess when it comes to the parsing process. Rather than engage in simultaneous top-down and bottom-up processing, they just kinda process it all at once. Either they fully get it, or they fully don't. Strictly speaking, they'll never ask you to repeat yourself.

So why should "geeks" write? Because geeks (erm... computer programmers) need to get used to this notion of ambiguity. Our entire careers are contingent on being perfectly understood by a computer at every stage of every program we write. In fact, I recently was having trouble because I forgot to put a closing bracket on one of my print statements. And although the problem was petty- it took me nearly an hour to solve it! I was trying to tell python to print a message, but was not being clear about what it was I wanted it to print. It couldn't tell whether I wanted it to stop or print the rest of the file!

And that is why geeks should write.

(As an aside- I purposely did not write anything about how important communication skills were in the workplace because I knew many people were already doing it. I figured this would be more unique. Differentiation is a good policy...)