
Commit 46367d7

Add files via upload
1 parent 4487c3f commit 46367d7

3 files changed

+278
-0
lines changed

Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
# coding: utf-8

# In[1]:


#it is also called levenshtein distance
#it can be done via recursion too
#the recursion version is much more elegant yet less efficient
# https://github.com/je-suis-tm/recursion/blob/master/edit%20distance%20recursion.jl

#edit distance is the minimum number of steps to transform one string into another
#the way to solve this problem is very similar to knapsack
#assume we have two strings text1 and text2
#we build a matrix with the size of (length(text1)+1)*(length(text2)+1)
#the extra row and column stand for the empty string

#there are three different ways to transform a string
#insert,delete and replace
#we can use any of them or a combination
#lets take a look at the best case first
#assume string text1 is identical to string text2
#we dont need to do anything
#so 0 steps would be the answer
#for the worst case
#when string text1 has nothing in common with string text2
#we would have to replace every letter of the shorter string
#and insert or delete the remaining letters
#the steps become the maximum step which is max(length(text1),length(text2))
#for general case,the number of steps would fall between the worst and the best case
#assume we are at i th letter of string text1 and j th letter of string text2
#if we wanna get the optimal steps of transforming string text1 to string text2
#we have to make sure at each letter transformation
#text1[1:i] and text2[1:j] have reached their optimal status
#otherwise,we could always find another combination of insert,delete and replace
#to get a "real" optimal text1[1:i] and text2[1:j]
#it would make our string transformation not so optimal any more
#it is the same logic as the optimization of knapsack problem
#after we set our logic straight
#we would take a look at three different approaches
#in the matrix,row i stands for the first i-1 letters of text1
#and column j stands for the first j-1 letters of text2
#lets take a look at insertion
#basically we need to insert the last letter of the text2 prefix at the end of the text1 prefix
#the cumulated steps we have taken should be matrix[i][j-1]+1
#matrix[i][j-1] is the steps to turn the same text1 prefix into the text2 prefix without its last letter
#for delete,it is vice versa
#the cumulated steps we have taken should be matrix[i-1][j]+1
#for replacement,it is a lil bit tricky
#there are two scenarios
#if text1[i-1]==text2[j-1]
#it should be matrix[i-1][j-1]
#we dont need any replacement at all
#else,it should be matrix[i-1][j-1]+1
#we replace the letter text1[i-1] with the letter text2[j-1]
#after we managed to understand three different approaches
#we want to take the minimum number of steps among these three approaches
#throughout the iteration of different positions of both strings
#in the end,we would get the optimal steps to transform one string to another,YAY


# In[2]:


function edit_distance(text1,text2)

    #julia strings are indexed by byte,not by letter
    #collect turns them into character vectors
    #so chars1[i] is always the i th letter,even for letters like é
    chars1=collect(text1)
    chars2=collect(text2)

    len1=length(chars1)+1
    len2=length(chars2)+1

    #this part is to create a matrix of (length(text1)+1)*(length(text2)+1)
    matrix=[[0 for _ in 1:len2] for _ in 1:len1]

    #first column,we delete i-1 letters to reach an empty string
    for i in 1:len1
        matrix[i][1]=i-1
    end

    #first row,we insert i-1 letters into an empty string
    for i in 1:len2
        matrix[1][i]=i-1
    end

    #we take iterations on both string text1 and text2
    #next,we check if chars1[i-1]==chars2[j-1]
    #if yes,no replacement needed
    #if no,replacement needed
    #we take a minimum function to see which combination would give the minimum steps
    #eventually we get what we are after
    for i in 2:len1
        for j in 2:len2
            if chars1[i-1]==chars2[j-1]
                matrix[i][j]=min(matrix[i-1][j]+1,
                                 matrix[i][j-1]+1,
                                 matrix[i-1][j-1])
            else
                matrix[i][j]=min(matrix[i-1][j]+1,
                                 matrix[i][j-1]+1,
                                 matrix[i-1][j-1]+1)
            end
        end
    end

    return matrix[len1][len2]

end


# In[3]:


println(edit_distance("baiseé","bas"))
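
The table above is filled row by row, and each cell only looks at the current row and the row before it. A minimal sketch of that idea, assuming we only need the final distance and not the full table, keeps two rows instead of the whole matrix. The name `levenshtein_rows` is hypothetical and not part of this commit.

```julia
# Hypothetical helper (not part of the commit): edit distance with two rolling rows.
function levenshtein_rows(text1, text2)
    # collect gives character vectors, so indexing is per letter, not per byte
    a = collect(text1)
    b = collect(text2)

    # prev[j+1] is the distance between the first i-1 letters of a
    # and the first j letters of b; it starts as the row for the empty prefix
    prev = collect(0:length(b))
    curr = similar(prev)

    for i in 1:length(a)
        # turning i letters into an empty string takes i deletions
        curr[1] = i
        for j in 1:length(b)
            cost = a[i] == b[j] ? 0 : 1
            curr[j+1] = min(prev[j+1] + 1,   # delete a[i]
                            curr[j] + 1,     # insert b[j]
                            prev[j] + cost)  # keep or replace
        end
        # the row just filled becomes the previous row of the next pass
        prev, curr = curr, prev
    end

    return prev[end]
end

println(levenshtein_rows("kitten", "sitting"))  # 3
println(levenshtein_rows("baiseé", "bas"))      # 3
```

The trade-off is that the full matrix is gone, so the sequence of edits can no longer be traced back, only counted.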

edit distance recursion.jl

Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
# coding: utf-8

# In[1]:


#explanation can be found in dynamic programming version
# https://github.com/je-suis-tm/recursion/blob/master/edit%20distance%20dynamic%20programming.jl
#the only problem with this recursion is that it is so freaking slow
#the same pairs of prefixes get solved over and over again
#so the number of calls grows exponentially with the length of the strings
#although it looks much more elegant than dynamic programming


# In[2]:


function edit_distance(text1,text2)

    #base case,one string is empty
    #we insert or delete every letter of the other one
    if isempty(text1) || isempty(text2)
        return max(length(text1),length(text2))
    end

    #we are comparing characters here
    #to get string, we should do end:end
    if text1[end]==text2[end]
        replacement=0
    else
        replacement=1
    end

    #chop drops the last character
    #text1[1:end-1] slices by byte and can fail on non-ascii letters
    steps=min(edit_distance(chop(text1),text2)+1,
              edit_distance(text1,chop(text2))+1,
              edit_distance(chop(text1),chop(text2))+replacement)

    return steps

end


# In[3]:


println(edit_distance("arsehole","asshoe"))
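
The recursion above is slow because the same pair of prefixes is solved many times, not because recursion itself is slow. A minimal sketch, assuming a plain `Dict` as the cache, stores each answer the first time it is computed. The names `edit_distance_memo` and `solve` are hypothetical and not part of this commit.

```julia
# Hypothetical memoized variant (not part of the commit).
function edit_distance_memo(text1, text2)
    # character vectors, so a[i] is the i th letter regardless of encoding
    a = collect(text1)
    b = collect(text2)

    # cache maps (i, j) to the distance between the first i letters of a
    # and the first j letters of b
    cache = Dict{Tuple{Int,Int},Int}()

    function solve(i, j)
        # base case: one prefix is empty
        if i == 0 || j == 0
            return max(i, j)
        end

        # reuse the answer if this pair of prefixes was solved before
        key = (i, j)
        if haskey(cache, key)
            return cache[key]
        end

        replacement = a[i] == b[j] ? 0 : 1
        steps = min(solve(i - 1, j) + 1,
                    solve(i, j - 1) + 1,
                    solve(i - 1, j - 1) + replacement)

        cache[key] = steps
        return steps
    end

    return solve(length(a), length(b))
end

println(edit_distance_memo("arsehole", "asshoe"))  # 3
```

With the cache, each pair `(i, j)` is computed once, so the work is proportional to `length(text1)*length(text2)`, the same as the dynamic programming table.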

knapsack.jl

Lines changed: 110 additions & 0 deletions
@@ -0,0 +1,110 @@
# coding: utf-8

# In[1]:


#this has nothing to do with recursion algorithm
#it happened to be in the recursion chapter in my book
#so i kept it under recursion
#it is about dynamic programming
#it is kinda tricky to understand
#if you are familiar with convex optimization or lagrangian
#you may find it easier to use them instead of this


#knapsack problem is to maximize the total value
#while having a weight constraint
#each item has a value and a weight
#the knapsack has a maximum capacity which is the constraint


#to solve the problem,we have to use recursive thinking
#lets create a list from 0 to the maximum capacity
#for each capacity in the list
#we try to reach the optimal allocation of weight at the given capacity
#say we have c as the maximum capacity
#we remove the last item i from the optimal knapsack
#the remaining items weigh at most capacity-weight[i]
#we wanna make sure our allocation at capacity-weight[i] is still the optimal
#by optimal,we mean for the same weight we can achieve the highest value
#if the allocation at capacity-weight[i] is not the optimal
#we can find another combo within the same weight but with higher value
#we add the item i back into the knapsack then
#the new total value we get would be larger than the previous so-called optimal
#it will contradict the definition of optimal
#hence,for each capacity,we keep removing items
#until we reach base case 0,and it always stays the optimal at the given capacity


#to get the optimal status
#we shall do a traversal on all items
#we create a matrix with (number of items+1)*(maximum capacity+1)
#row 1 stands for no items and column 1 stands for capacity 0
#so row i stands for item i-1 and its weight is weight[i-1]
#for each capacity level,we try to add a new item
#if the weight of the new item is larger than the current capacity level
#the knapsack reverts to the previous status without the item which is matrix[i-1][j]
#if the new item fits within the current capacity level
#we try to see whether adding the item would be the new optimal case
#so we compare the previous status with the status after adding the item
#the status after adding the item shall be matrix[i-1][j-weight[i-1]]+value[i-1]
#we use j-weight[i-1] cuz adding the item would reduce the capacity we have
#we have to subtract the weight of the item from the current constraint level j

# In[2]:
55+
56+
57+
function knapsack(value,weight,capacity)
58+
59+
#in this section,we create a nested list with size of (number of items+1)*(capacity+1)
60+
matrix=[[0 for _ in 1:(capacity+1)] for _ in 1:(length(value)+1)]
61+
62+
#now we begin our traversal on all elements in matrix
63+
#i starts from 2 cuz we would be using i-1 to imply item i
64+
for i in 2:(length(value)+1)
65+
66+
for j in 2:(capacity+1)
67+
68+
#this is the part to check if adding item i-1 would exceed the current capacity j
69+
#if it does,we go back to the previous status
70+
#if not,we shall find out whether adding item i-1 would be the new optimal
71+
if weight[i-1]>j
72+
73+
matrix[i][j]=matrix[i-1][j]
74+
75+
else
76+
77+
#julia index starts from 1
78+
#which is a pain in the ass
79+
#when current capacity==the new item s weight
80+
#it creates an issue
81+
if j==weight[i-1]
82+
83+
ind=1
84+
85+
else
86+
87+
ind=j-weight[i-1]
88+
89+
end
90+
91+
#we use max funcion to see if adding item i-1 would be the new optimal
92+
matrix[i][j]=max(matrix[i-1][j],
93+
matrix[i-1][ind]+value[i-1])
94+
95+
end
96+
97+
end
98+
99+
end
100+
101+
return matrix[length(value)+1][capacity+1]
102+
103+
end
104+
105+
106+
# In[3]:
107+
108+
109+
println(knapsack([0,50,60,60,120],[0,10,15,20,40],50))
110+
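
The function above returns only the best total value. A minimal sketch, assuming integer weights and a 2-D array instead of nested lists, also walks back through the table to report which items were picked. The name `knapsack_items` is hypothetical and not part of this commit.

```julia
# Hypothetical helper (not part of the commit): 0/1 knapsack with item recovery.
# Column c+1 of the table stands for capacity c, row i+1 for the first i items.
function knapsack_items(value, weight, capacity)
    n = length(value)
    table = zeros(Int, n + 1, capacity + 1)

    for i in 1:n
        for c in 0:capacity
            # option 1: leave item i out
            best = table[i, c + 1]
            # option 2: take item i if it fits in capacity c
            if weight[i] <= c
                best = max(best, table[i, c + 1 - weight[i]] + value[i])
            end
            table[i + 1, c + 1] = best
        end
    end

    # walk back from the last cell
    # a change between row i and row i+1 means item i was taken
    chosen = Int[]
    remaining = capacity
    for i in n:-1:1
        if table[i + 1, remaining + 1] != table[i, remaining + 1]
            push!(chosen, i)
            remaining -= weight[i]
        end
    end

    return table[n + 1, capacity + 1], reverse(chosen)
end

best, items = knapsack_items([50, 60, 60, 120], [10, 15, 20, 40], 50)
println(best)   # 170
println(items)
```

When several combinations reach the same value, the walk back returns just one of them, so the printed item list is one optimal choice rather than the only one.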
